JELLYFISH GREEN FLUORESCENT PROTEIN (GFP) EXPRESSION IN PLANTS
Field of the Invention 

This invention relates to improvements in gene expression, especially improvements in
expression of the Green Fluorescent Protein (GFP) gene, and to a method of detecting
the presence and/or expression in a host of a gene of interest.

Background of the Invention 

Genes encoding β-glucuronidase and β-galactosidase have been used as reporters for gene
expression in plants. Using these reporter genes, transformed tissues or patterns of gene
expression can be identified histochemically, but this is generally a destructive test and
is not suitable for assaying primary transformants, nor for following the time course of
gene expression in living plants, nor as a means of rapidly screening segregating
populations of seedlings. There is thus a general need for improved reporters of gene
expression, but especially reporter genes for use in plants. Candidates might be found
among proteins having intrinsic fluorescence.

Proteins with high intrinsic fluorescence are involved in photosynthesis and
bioluminescence, and in most cases possess a protein-bound chromophore. For example,
the highly fluorescent phycobiliproteins require complex tetrapyrrole groups, and the blue
and yellow fluorescent proteins from Vibrio fischeri must bind lumazine and flavin
mononucleotide, respectively. This requirement for an external chromophore complicates
the use of these proteins as reporters for gene expression. However, the green fluorescent
protein (GFP) from the jellyfish Aequorea victoria does not share this requirement for an
external chromophore.

Aequorea victoria are brightly luminescent, with light appearing as glowing points around
the margin of the jellyfish umbrella. Light arises from yellow tissue masses which each
consist of about 6000-7000 photogenic cells (Davenport & Nichol, 1955 Proc. Roy. Soc.
Ser. B 144, 399-411). The cytoplasm of these cells is densely packed with fine granules
of about 0.2 μm diameter which are enclosed by a unit membrane and contain the
components necessary for bioluminescence (Anderson & Cormier, 1973 J. Biol. Chem.
248, 2937-2943). The components include a Ca2+-activated photoprotein, aequorin, that
emits blue-green light, and an accessory green fluorescent protein (GFP) which accepts
energy from aequorin and re-emits it as green light.

GFP is an extremely stable protein of 238 amino acids. The fluorescent properties of the
protein are unaffected by prolonged treatment with 6M guanidine HCl, 8M urea or 1%
SDS, and two day treatment with various proteases such as trypsin, chymotrypsin,
papain, subtilisin, thermolysin and pancreatin at concentrations up to 1 mg/ml fails to alter
the intensity of GFP fluorescence. GFP is stable in neutral buffers up to 65 °C, and
displays a broad range of pH stability from 5.5 to 12 (Bokman & Ward, 1981 Biochem.
Biophys. Res. Comm. 101, 1372-1380). The protein is intensely fluorescent, with a
quantum efficiency of approximately 80% and molar extinction coefficient of around
4.5xlO J . GFP absorbs light maximally at 395 nm and has a smaller absorbance peak at
475 nm, and fluorescence emission peaks at 509 nm, with a shoulder at 540 nm (Morise
et al., 1974 Biochemistry 13, 2656-2662). Researchers have successfully cloned and
sequenced both the cDNA and genomic DNA sequences coding for A. victoria GFP
(Prasher et al., 1992 Gene 111, 229-233).

The fluorescence of GFP has been well characterised (Inouye & Tsuji, 1994 FEBS
Letters 341, 277-280 and FEBS Letters 351, 211-214) and appears to be due to a
unique covalently-attached chromophore which is formed post-translationally by
cyclisation and oxidation of the residues 65-67 (Ser-Tyr-Gly) within the protein (Cody
et al., 1993 Biochemistry 32, 1212-1218; Heim et al., 1994 PNAS 91, 12501-12504).
Several genomic and cDNA clones of gfp have been obtained from a population of A.
victoria. The gfp gene contains at least three introns, and the sequences derived from the
cDNA have been used for protein expression studies in Escherichia coli, Caenorhabditis
elegans (Chalfie et al., 1994 Science 263, 802 et seq.) and Drosophila melanogaster
(Wang & Hazelrigg, 1994 Nature 369, 400 et seq.). Fluorescent protein was produced
in these different cell types and there appears to be little requirement for specific
additional factors for post-translational modification of the protein, which may be



WO 96/27675 PCT/GB96/00481



autocatalytic or require common factors. Recently, modified forms of GFP have also
been made and studied, which forms have rather different fluorescence characteristics
compared to the wild type (Heim et al., 1994 PNAS 91, 12501-12504; Delagrave et al.,
1995 Bio/Technology 13, 151 et seq.; and Heim et al., 1995 Nature 373, 663-664).

Although GFP has some advantages as a fluorescent reporter molecule, expression has
been reported to be problematic in some experimental systems (Cubitt et al., 1995 Trends
Biochem. Sci. 20, 448-455). Expression of GFP in mammalian cells has been described
as highly variable (Rizzuto et al., 1995 PNAS 92, 11899-11903; Kaether & Gerdes,
1995 FEBS Lett. 369, 267-271; Pines, 1995 Trends Genet. 326-327), often requiring
a strong promoter and decreased incubation temperature for good results (Ogawa et al.,
1995 PNAS 92, 11899-11903). Other researchers have found that a lower incubation
temperature also favours the development of fluorescence during expression of GFP in
bacteria (Heim et al., 1994 PNAS 91, 12501-12504; Webb et al., 1995 J. Bact. 177,
5906-5911) and yeast (Lim et al., 1995 J. Biochem. 118, 13-17). In yeast, this
phenomenon has been attributed primarily to more efficient maturation of GFP to the
fluorescent form at lower temperatures. The present inventors have sought to express
modified forms of GFP in various hosts.

Summary of the Invention 



In a first aspect the invention provides a DNA sequence encoding Green Fluorescent 
Protein (GFP), the sequence being modified relative to the wild type sequence so as to 
allow for more efficient expression in a plant cell of a functional GFP polypeptide. 
Preferably the modified sequence is capable of efficient expression in a dicotyledonous 
plant, such as Arabidopsis. 

The term "GFP" as used herein is intended to refer to a polypeptide possessing many of 
the properties of the naturally occurring protein, and particularly exhibiting intrinsic 
fluorescence. As will be apparent to those skilled in the art, the polypeptide need not
necessarily fluoresce in the "green" part of the visible spectrum, as the fluorescence
properties of the polypeptide (including the wavelength of fluorescence) may be
substantially altered by one or more mutations.



The GFP polypeptide will generally have substantially the amino acid sequence of the
wild type protein (as disclosed by Prasher et al.), but will preferably comprise one or
more amino acid differences defined, and discussed in greater detail, below. 

The GFP-coding DNA sequence is advantageously modified so as to reduce the 
probability of an RNA sequence transcribed therefrom being subject to erroneous splicing 
in a plant cell. The DNA sequence is conveniently modified so as to comprise a plurality 
of nucleotide substitutions relative to the wild type sequence, which substitutions serve 
to reduce, or preferably entirely prevent, excision from the transcribed RNA of the 
portion corresponding to nucleotides 400-483 of the DNA sequence. It has surprisingly 
been found by the present inventors that this portion of the sequence tends to be 
recognised in plant cells (particularly dicotyledonous plants) as an intron, which is
therefore excised by splicing of the RNA. 

Preferably the nucleotide substitutions in the DNA sequence serve to decrease the A/U% 
content of the transcribed RNA, which is believed to decrease the likelihood of the 
sequence being treated by a plant cell as representing an intron. Desirably the 
substitutions particularly decrease the A/U% content of the region corresponding to 
nucleotides 400-483. Conveniently the nucleotide substitutions are such as to preserve 
the amino acid sequence of the encoded polypeptide substantially unchanged in the 
portion encoded by nucleotides 400-483. Other substitutions may advantageously be 
made to decrease the similarity between the GFP RNA sequence and the plant intron 
recognition consensus sequence (see Figure 2). 
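
By way of illustration only, the A/U content of a window such as nucleotides 400-483 might be computed as in the following Python sketch. The function name au_content, the 1-based inclusive coordinates and the two toy sequences are assumptions made for this example; they are not taken from the gfp sequence itself.

```python
def au_content(seq, start=1, end=None):
    """Return the percentage of A and U (or T) residues in seq[start..end].

    Coordinates are 1-based and inclusive, matching the numbering used in the
    text (e.g. start=400, end=483). DNA input is accepted: T is counted as U.
    """
    rna = seq.upper().replace("T", "U")
    if end is None:
        end = len(rna)
    if not 1 <= start <= end <= len(rna):
        raise ValueError("window lies outside the sequence")
    window = rna[start - 1:end]
    return 100.0 * sum(base in "AU" for base in window) / len(window)


# Hypothetical toy sequences, NOT taken from gfp: an A/T-rich stretch and a
# variant in which several positions carry G or C instead.
at_rich = "ATTTAAATATGTAATT"
gc_shifted = "ACTCAAGTACGTGATC"

print(len(range(400, 484)))    # number of nucleotides in the window 400-483
print(au_content(at_rich))     # A/U percentage of the A/T-rich toy sequence
print(au_content(gc_shifted))  # lower A/U percentage after substitutions
```

Applied to the wild type and modified coding sequences with start=400 and end=483, a function of this kind would give the two A/U percentages that the substitutions are intended to separate.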

In addition to nucleotide substitutions in the portion 400-483 of the DNA sequence, a 
number of other modifications may advantageously be made. For example, substitutions 
may be made downstream (i.e. 3') and/or upstream (i.e. 5') of nucleotides 400-483,
which substitutions will conveniently serve to further reduce the A/U content of the
transcribed RNA.

The invention also includes within its scope RNA sequences capable of being transcribed
from the modified DNA sequence (i.e. RNA sequences transcribed from the modified 
DNA sequence, or an RNA having a sequence such that it could be synthesised by 
transcription from the modified DNA sequence). 

Advantageously a number of other modifications, in addition to those specified above, 
may also be made to the GFP-coding sequence. For example, the sequence will typically 
further comprise transcription and translation signals (e.g. promoters, enhancers) and/or 
localisation signals recognised in plants. Localisation signals may direct the expressed 
polypeptide to: the nucleus (e.g. SV40 large T antigen localisation signal, particularly in 
combination with other polypeptide sequences, which have been found to increase the 
efficiency of the signalling); mitochondria (e.g. cytochrome C oxidase subunit IV); 
endoplasmic reticulum - ER (e.g. the signal sequence from carboxypeptidase Y, or that 
from Arabidopsis basic chitinase); or microbodies (e.g. peroxisomes). It may even be 
possible to target the modified GFP polypeptide to the plant cell wall. Localisation of 
GFP may be highly desirable when expression occurs at high levels, so as to minimise 
possible toxicity to host cells. 

The sequence may advantageously be further modified in accordance with the manner
described in the prior art (e.g. as disclosed by Heim et al., 1994, 1995, or Delagrave et
al., 1995, as cited previously). Whilst in general the DNA sequence is modified in such
a way as to preserve the wild type amino acid sequence, it has been found that amino 
acid changes at specific residues are in fact desirable. In particular, the sequence may 
be modified so as to comprise an amino acid substitution at one or both of amino acid 
residues 163 and 175. Changes at these positions are found to alter the characteristics 
of the polypeptide in an unexpected and favourable manner. 

In particular, amino acid substitutions at residue 163 and/or 175 (valine and serine 
respectively, in the wild type protein) have favourable effects on the characteristics of the 
GFP polypeptide when expressed in many different host cells (e.g. bacterial, yeast etc.),
and such substitutions may advantageously be included independently of any modification
of the DNA sequence made for increased efficiency of expression in plants.

In a second aspect therefore the invention provides a modified GFP polypeptide
comprising an amino acid substitution relative to the wild type protein at residue 163 
and/or 175. Substitution at either residue, in isolation, has surprisingly been found to 
increase the thermotolerance of the polypeptide. Maximal thermotolerance is obtained
by causing substitution at both residues. 

Advantageously valine 163 is substituted by alanine, or by a related amino acid (i.e. those 
having an aliphatic side chain: glycine, leucine and isoleucine; or those having an 
aliphatic hydroxyl side chain: serine and threonine).

Advantageously serine 175 is substituted by glycine, or by a related amino acid (i.e. 
those having an aliphatic side chain: alanine, leucine and isoleucine; or those having an 
aliphatic hydroxyl side chain: serine and threonine). 

It is preferred that the modified GFP polypeptide comprises substitutions at both residues, 
conveniently 163→alanine and 175→glycine.

The modified GFP may additionally comprise other sequence differences relative to the 
wild type protein, particularly in, or immediately adjacent to, residues 65-67 (which
residues give rise to the chromophore). 

Such a nucleic acid sequence is useful for example, as a marker, or as a reporter gene, 
in a wide variety of host cells (e.g. mammalian, bacterial, fungal, yeast or plant cells). 
Conveniently, the nucleic acid sequence may be further modified for expression in a 
particular host cell. For example, where the thermotolerant GFP-coding sequence is to 
be expressed in a plant cell it will conveniently be modified in accordance with the first
aspect of the invention. 

In a third aspect, the invention provides a nucleic acid construct comprising a nucleic acid 
sequence in accordance with the first aspect of the invention. In particular the construct 
is preferably an expression vector, comprising one or more regulatory signals (such as 
promoters etc.) and is preferably suitable for use in a plant cell. The construct will
desirably include one or more restriction endonuclease sites, suitable for the insertion into
the construct of other nucleic acid sequences, which in a preferred embodiment may be
inserted in frame with the sequence of the invention. 

The invention also provides a host cell, conveniently a plant cell, into which has been
introduced a sequence in accordance with the first aspect of the invention. 

Processes for introducing DNA into plant cells (typically by transformation) are not 
100% efficient. Accordingly, it is generally desirable for the DNA introduced into the
plant cell to confer one or more distinctive characteristics upon the plant cell, which 
characteristic(s) serve to mark those cells which have taken up the DNA. The fluorescent 
properties of GFP constitute such a distinctive characteristic ("marker"). In a preferred
embodiment the invention thus provides a plant cell transformation vector comprising the 
sequence of the invention. Further, the invention provides a method of screening plant 
cells, comprising introducing into at least some of a plurality of plant cells a DNA
construct comprising a sequence in accordance with the invention, maintaining the cells 
under suitable conditions for an appropriate length of time so as to allow expression of 
a modified GFP from the construct, and selecting those cells which exhibit GFP-mediated 
fluorescence. "Suitable conditions" and "an appropriate length of time" are well known 
to those skilled in the art from standard texts.

In a preferred embodiment, the vector further comprises a sequence of interest which,
preferably, is present in frame with the modified GFP-coding sequence. 

In a fourth aspect the invention thus provides a method of detecting the expression in a 
plant of a sequence of interest, comprising causing the sequence of interest to be present
in frame with a modified GFP-coding sequence in accordance with the first aspect of the 
invention so as to form a modified GFP/sequence of interest fusion, introducing the 
fusion into a plant, and monitoring the fluorescence thereof. GFP-mediated fluorescence
is thus an indicator of expression of the sequence of interest. 

In yet another aspect, the invention provides a nucleic acid construct comprising a
sequence in accordance with the second aspect of the invention. The nucleic acid
construct will desirably have many of the features of the nucleic acid construct in 
accordance with the third aspect of the invention. It will be apparent however that the 
construct may be useful in many different types of host cell, and may be constructed 
accordingly. 



The invention will now be further described by way of illustration and with reference to 
the accompanying drawings, in which: 

Figure 1A shows the sequences introduced, via PCR, flanking the GFP-coding sequence; 

Figure 1B is a confocal micrograph of transformed yeast cells expressing GFP;

Figure 2A is a photograph showing agarose gel electrophoresis analysis of PCR products; 

Figure 2B is a schematic illustration of the portion of DNA not represented in the mis- 
spliced mRNA produced in plants from the wild type GFP-coding sequence; 

Figure 3A is a photograph showing the DNA sequence determination of the reverse 
transcript produced from mis-spliced mRNA; 

Figure 3B is a comparison between a portion of the GFP wild type sequence and a plant 
intron consensus sequence; 

Figure 4 shows a comparison of part of the wild type A. victoria GFP sequence with a
modified GFP-coding sequence in accordance with the invention; 

Figure 5 is a series of confocal micrographs (at different magnification) showing parts
of a plant expressing a modified GFP-coding sequence in accordance with the invention;

Figure 6 shows a comparison of part of three modified GFP-coding sequences in
accordance with the invention, together with the amino acid sequences encoded thereby;

Figure 7 is a graph of relative fluorescence (arbitrary units) against time (minutes) for E.
coli strains expressing modified GFP (open squares) or modified, mutated GFP (filled
circles);



Figure 8 is a photograph of a Western blot, probed with anti-GFP antibody; 

Figure 9 is a bar chart showing the amount of fluorescence associated with cultures 
expressing modified GFP (open columns) or modified, mutated GFP (shaded columns)
incubated at four different temperatures; 

Figure 10 is a graph of fluorescence against time (minutes) for yeast cultures at 25 °C or 
37°C, the cultures having been grown initially in anaerobic conditions, with oxygen
introduced at time zero; 

Figure 11 is a picture of a Western blot showing expression of modified GFP or 
modified, mutated GFP (GFPA) by E. coli cultures at 25 or 37°C, with comparison
between soluble and insoluble culture fractions; 

Figure 12 is a graph of absorbance against wavelength for soluble (filled circles) or 
insoluble (open circles) GFP; 

Figure 13 is a picture of yeast cultures, grown at 25 or 37 °C and expressing modified 
GFP, or modified, mutated GFP (GFPA); 

Figure 14 is a graph of fluorescence against wavelength (nm), showing the excitation
spectra (squares) and emission spectra (circles) respectively, of modified GFP (solid
lines) and two mutated forms of modified GFP, GFPA (dashed lines) and GFP5 (dotted
lines); 



Figure 15 is a comparison of the nucleotide sequence of wild-type gfp and a modified 
gene m-gfp5, and the polypeptides encoded thereby. Nucleotide sequence differences are
shown in bold. The m-GFP5 amino acid sequence is shown beneath the nucleotide
sequence. The three amino acid differences between the encoded polypeptides are
indicated; 

Figure 16 shows the sequence of another modified m-gfp gene, termed m-gfp5-ER, and
the amino acid sequence of the polypeptide encoded thereby; and

Figure 17 shows a number of confocal micrographs (A-H) of Arabidopsis seedlings
expressing modified gfp genes in accordance with the invention. 

Examples 
Example 1 

Construction of a gfp expression cassette 

A synthetic gfp gene was constructed using the polymerase chain reaction (PCR). The 
plasmid pGFP10.1 (described by Prasher et al., 1992 Gene 111, cited above) contains
a cloned A. victoria gfp cDNA, and was used as template for PCR amplification (with
Thermococcus litoralis Vent polymerase) with synthetic oligomer primers which were 
used to incorporate new sequences flanking the GFP coding sequence. 

The sequence of the primer oligonucleotides was: 

GGCGGATCCAAGGAGATATAACAATGAGTAAAGGAGAAGAACTTTTCACT (Seq. 
ID No. 1) and GGCGAGCTCTTATTTGTATAGTTCATCCATGCC (Seq. ID No. 2). 

The newly-incorporated sequences are shown in Figure 1A. Referring to Figure 1A, the
sequence existing in pGFP10.1 is shown italicised. The added sequences are shown in
normal type. These included: recognition sites for the restriction endonucleases BamHI
and SacI placed at the 5' and 3' termini of the amplified fragment; a Shine-Dalgarno
ribosome binding site (RBS) sequence positioned upstream of the initiation codon to
ensure efficient translation of the transcribed gene in E. coli; and the sequence AACA
inserted between positions -4 and -1 for efficient translation in plants.
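
The features listed above can be located in the two primer sequences with a short script. The following Python sketch is illustrative only: the helper names are hypothetical, offsets are 0-based, and the choice of AAGGAG as the Shine-Dalgarno core motif is an assumption of this example (the text does not mark which bases form the RBS). The recognition sequences GGATCC (BamHI) and GAGCTC (SacI) are the standard ones for those enzymes.

```python
FORWARD = "GGCGGATCCAAGGAGATATAACAATGAGTAAAGGAGAAGAACTTTTCACT"  # Seq. ID No. 1
REVERSE = "GGCGAGCTCTTATTTGTATAGTTCATCCATGCC"                   # Seq. ID No. 2

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def annotate_forward(primer):
    """Locate the added features in the forward primer (0-based offsets)."""
    start = primer.find("ATG")  # first ATG is taken as the initiation codon
    return {
        "BamHI": primer.find("GGATCC"),
        "RBS": primer.find("AAGGAG"),        # assumed Shine-Dalgarno core
        "start_codon": start,
        "context": primer[start - 4:start],  # bases at positions -4 to -1
    }


def annotate_reverse(primer):
    """Locate the SacI site and read the coding-strand view of the primer."""
    return {
        "SacI": primer.find("GAGCTC"),
        "coding_strand": reverse_complement(primer),
    }


print(annotate_forward(FORWARD))
print(annotate_reverse(REVERSE))
```

Read on the coding strand, the reverse primer ends the GFP reading frame with a TAA stop codon immediately followed by the SacI site, which is how the 3' restriction site is appended without altering the encoded protein.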



The PCR-amplified fragment was subcloned into pUC119 for bacterial expression, and
into an episomal yeast plasmid vector, pVT103-U (Vernet et al., 1987 Gene 52, 225-233),
which contains a yeast 2μ origin of replication and a truncated form of the yeast ADH1
promoter to allow high level expression of the cloned GFP gene in Saccharomyces
cerevisiae. S. cerevisiae MGLD-4a (a, leu2, ura3, his3, trp1, lys2) cells were
transformed using the lithium acetate method described by Ito et al. (1983).

Transformed E. coli and yeast bearing the recombinant plasmid were observed to produce
brilliantly fluorescent colonies under long wavelength UV illumination using a hand-held
lamp. Interestingly, a high degree of sectoring was seen in yeast colonies containing the
episomal form of the gfp cDNA; the sectoring was eliminated by integration of the gfp
cDNA at the yeast URA3 locus and presumably reflects the instability of the 2μ-based
episome and/or some toxic effects of GFP expression. This observation also indicated
the utility of GFP as a simple cell-autonomous marker. Examination of transformed
yeast cells by confocal microscopy showed that the protein was predominantly distributed
throughout the cytoplasm (Fig. 1B).

After the PCR amplified gfp cDNA was shown to correctly produce fluorescent protein
product in yeast, the sequence was inserted between the BamHI and SacI sites in the plant
transformation vector pBI121 behind the 35S promoter (Jefferson et al., 1987). A.
tumefaciens strain LBA4404 was transformed with the gfp-containing plasmid by
electroporation. Roots of Arabidopsis thaliana C24 were transformed using the protocol
of Valvekens et al. (1988). Transgenic callus and shoots were screened for GFP
expression using an inverted fluorescence microscope (Leitz DM-IL) fitted with an
appropriate filter set (Leitz-D). However, at no stage during the transformation
procedure was GFP-related fluorescence detected by UV lamp illumination, or by
epifluorescence microscopy. This lack of fluorescence was unexpected and surprising in
view of the fluorescence exhibited previously by the transformed bacteria and yeast cells.

gfp is mis-spliced in Arabidopsis 

The successful expression of GFP in Arabidopsis requires proper production of the
apoprotein, before post-translational modification to form the chromophore. The
inventors therefore used PCR-based methods to verify the correct insertion of the 35S
promoter-driven gfp cDNA, and to check mRNA transcription and processing in
transformed plantlets. Nucleic acids were extracted from plantlets and either treated with
RNase, or DNase treated and reverse transcribed using oligo(dT) 8 primer. The gfp
sequences in these extracts were therefore derived from genomic DNA or transcribed
mRNAs, respectively. The gfp sequence was PCR-amplified from these separate extracts
and products were analysed by restriction endonuclease digestion, as shown in Figure 2A.

In Figure 2A, the RNase-treated DNA sample ("mRNA") is shown on the left of each
pair of samples, whilst the DNase-treated/reverse transcribed sample is shown on the
right of each pair. The samples were either loaded onto gel without prior restriction
("uncut", extreme left hand pair of samples) or loaded after prior digestion with (from
left to right): NcoI; RsaI; DraI; AccI; HincII or AvaII. It can be seen that whilst the
expected product was obtained after amplification of the gene, RT-PCR of mRNA
sequences gave rise to a truncated product. This product was 80-90 base pairs shorter
than expected and was uncut by the restriction endonucleases DraI and AccI, whilst the
gene sequences contained unique recognition sites for these enzymes. The inventors
established that this is consistent with a small deletion within the gfp coding sequence as
shown in Figure 2B. In this figure, the shaded portion represents that missing from the
mRNA-derived RT-PCR amplified sequences.

The shortened RT-PCR product was cloned and sequenced (Fig. 3A), and a deletion of
84 nucleotides between residues 400-483 was located. The nucleotide sequences
bordering this deletion are shown in Figure 3B, and demonstrate similarity to known plant
introns. The sequence across the splice site (marked with an arrow in Figure 3A) thus
reads (5' to 3') ...AG/AC... Matches were found for important residues at the 5' and
3' splice sites (reviewed by Luehrsen et al., 1994) and the excised gfp sequence contains
a high predicted A:U content (68%) which has also been shown to be important for
recognition of plant introns (Wiebauer et al., 1988; Hanley & Schuler, 1988; Goodall &
Filipowicz, 1989, 1991). It is therefore likely that this 84 nucleotide region of the
jellyfish gfp cDNA sequence is recognised as an intron when transcribed in Arabidopsis,
resulting in the production of a defective protein product. It should be noted that the
borders of this cryptic intron do not coincide with any of the natural spliced junctions
found after processing of the gfp mRNA in A. victoria.
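
The consequence of excising nucleotides 400-483 for the reading frame can be worked through with simple arithmetic. The Python sketch below is illustrative only: it assumes that nucleotide 1 is the A of the initiation codon (as the numbering is described elsewhere in this specification), that the 238 residue count excludes the stop codon, and that translation would otherwise proceed normally across the new junction. The function names are hypothetical.

```python
def codon_number(nt):
    """Return the 1-based codon containing 1-based nucleotide nt."""
    return (nt - 1) // 3 + 1


def deletion_summary(first_nt, last_nt, protein_length):
    """Summarise the effect of deleting nucleotides first_nt..last_nt."""
    deleted = last_nt - first_nt + 1
    in_frame = deleted % 3 == 0
    # The deletion removes whole codons only if it starts at a codon's first
    # base and has a length divisible by three.
    whole_codons = in_frame and (first_nt - 1) % 3 == 0
    return {
        "deleted_nt": deleted,
        "in_frame": in_frame,
        "whole_codons": whole_codons,
        "first_codon": codon_number(first_nt),
        "last_codon": codon_number(last_nt),
        # Only meaningful for an in-frame deletion.
        "predicted_length": protein_length - deleted // 3 if in_frame else None,
    }


summary = deletion_summary(400, 483, 238)
print(summary)
```

On these assumptions the excised region corresponds to codons 134 to 161, so the mis-spliced message would remain in frame but encode a polypeptide lacking 28 internal residues, which is consistent with a defective, non-fluorescent product rather than a frameshifted one.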



Modification of the gfp gene 

The jellyfish gene was mutated to produce a modified gfp (m-gfp) suitable for expression 
in Arabidopsis, as described below.

Two mutagenic oligonucleotides were synthesized, a 122-mer: 

GATCATATGAAGCGGCACGACTTCTTCAAGAGCGCCATGCCTGAGGGATACG
TGCAGGAGAGGACCATCTTCTTCAAGGACGACGGGAACTACAAGACACGTG
CTGAAGTCAAGTTTGAGGG (Seq. ID No. 3),

and a 126-mer: 

GATGTATACGTTGTGGGAGTTGTAGTTGTATTCCAACTTGTGGCCGAGGATG
TTTCCGTCCTCCTTGAAATCGATTCCCTTAAGCTCGATCCTGTTGACGAGGGT
GTCTCCCTCAAACTTGACTTC (Seq. ID No. 4).

The oligonucleotides were purified by electrophoresis in a 5% polyacrylamide gel 
containing TBE and 7M urea. The gel was stained briefly with 0.05% toluidine blue, 
and the full-length oligonucleotides were excised, and eluted overnight in 0.5M 
ammonium acetate, 0.1mM Na2EDTA, 0.1% SDS. The oligonucleotides share 17
nucleotides of complementarity at their 3' termini, and were annealed and elongated after 
several rounds of thermal cycling with Vent polymerase. The extended product was 
cloned between the Nde I and Acc I sites of gfp. The mutant clones were screened for 
the presence of the diagnostic restriction endonuclease sites Cla I and Ava II, and the
desired fragment (m-gfp) was subcloned into M13 and its sequence verified by DNA 
sequencing using the dideoxynucleotide chain termination technique with T7 DNA 
polymerase. 
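
The annealing and extension of the two mutagenic oligonucleotides can be checked in silico. The following Python sketch uses the two sequences given above (Seq. ID Nos. 3 and 4); the function names are hypothetical, and the recognition sequences used (CATATG for Nde I, GTATAC as one form of the Acc I site, ATCGAT for Cla I, GGACC as one form of the Ava II site GGWCC) are the standard ones for those enzymes rather than details stated in this specification.

```python
OLIGO_122 = (  # Seq. ID No. 3
    "GATCATATGAAGCGGCACGACTTCTTCAAGAGCGCCATGCCTGAGGGATACG"
    "TGCAGGAGAGGACCATCTTCTTCAAGGACGACGGGAACTACAAGACACGTG"
    "CTGAAGTCAAGTTTGAGGG"
)
OLIGO_126 = (  # Seq. ID No. 4
    "GATGTATACGTTGTGGGAGTTGTAGTTGTATTCCAACTTGTGGCCGAGGATG"
    "TTTCCGTCCTCCTTGAAATCGATTCCCTTAAGCTCGATCCTGTTGACGAGGGT"
    "GTCTCCCTCAAACTTGACTTC"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def three_prime_overlap(a, b):
    """Length of the longest 3'-terminal stretch of a that pairs with the
    3'-terminal stretch of b (perfect complementarity only)."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == reverse_complement(b[-k:]):
            return k
    return 0


def extend(a, b):
    """Top strand of the duplex obtained by annealing the 3' ends of a and b
    and filling in both strands."""
    k = three_prime_overlap(a, b)
    if k == 0:
        raise ValueError("oligonucleotides do not anneal at their 3' termini")
    return a + reverse_complement(b)[k:]


overlap = three_prime_overlap(OLIGO_122, OLIGO_126)
product = extend(OLIGO_122, OLIGO_126)
print(overlap, len(product))
for name, site in [("Nde I", "CATATG"), ("Acc I", "GTATAC"),
                   ("Cla I", "ATCGAT"), ("Ava II", "GGACC")]:
    print(name, product.find(site))  # 0-based offset of each site, -1 if absent
```

The Nde I and Acc I sites lie near the two ends of the extended product, which fits their use as the cloning sites, while the Cla I and Ava II sites are internal, which fits their use for diagnostic screening of the mutant clones.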

The modifications introduced by the synthetic oligonucleotides were intended to alter the 
sequences which might be involved in 5' splice site recognition and to decrease the A:U 
content of the putative intron, as shown in Figure 4. In the Figure, the upper DNA
sequence is that of a portion of the wild type A. victoria GFP. The lower DNA sequence
is that of a portion of a modified GFP-coding sequence. The corresponding amino acid
sequence is shown beneath the DNA sequences. Modified nucleotides are shown
outlined. All DNA modifications affect only codon usage, and the m-gfp-encoded amino
acid sequence is identical to that of the wild-type jellyfish polypeptide. The "pseudo-intron"
sequence is underlined and the cryptic splice junctions are arrowed. Nucleotide
and amino acid residue numbering (on the left and right, respectively, of the oblique
stroke) start from the initiation codon. The boxed hexanucleotide sequences are NdeI and
AccI recognition sites respectively.

The m-gfp sequence was inserted behind the 35S promoter in pBI121, and introduced into
Arabidopsis using the root transformation technique. Brightly green fluorescent cells
were seen after co-cultivation with Agrobacterium. As shoot regeneration proceeded,
explants with different levels of green fluorescence could be observed. Regenerating
callus and shoots develop a bright red autofluorescence due to the formation of
chlorophyll within the tissues, and with the brightest m-gfp transformants the green
fluorescence was clearly detectable against this autofluorescent background using a hand
held UV lamp. This was similar to the levels of green fluorescence seen in transformed
yeast and E. coli. However, these very bright Arabidopsis transformants regenerated and 
set seed rather poorly. Nevertheless, seeds were obtained from over 50 transformed 
lines, allowed to germinate, and screened by epifluorescence microscopy. Several of the 
brightest lines were used for confocal laser scanning microscopy. 

Confocal microscopy of living plants 

The fluorescence properties of GFP and chlorophyll allow the use of fluorescence
microscopes equipped with common filter sets for fluorescein and rhodamine for dual
imaging in plant cells. Intact five day old m-gfp-transformed Arabidopsis seedlings were
mounted in water for confocal laser scanning microscopy. GFP fluorescence could be
clearly visualised in the transformed tissues, and chloroplasts provided a very effective
counter fluor in the upper parts of the plant. Optical sectioning of the m-gfp transformed
plants gives selective visual access to the internal details of living plant structure, as
shown in Figure 5, without any need for staining or dissection. For example, median
longitudinal sections of root tips can be simply obtained by adjusting the microscope
depth of focus, and confocal imaging allows the resolution of subcellular details. GFP
is found throughout the cytoplasm, but appears to accumulate within the nucleoplasm. It
appears excluded from vacuoles, organelles and other bodies in the cytoplasm, and is
excluded from the nucleolus. Similarly, in optical sections of cotyledon and hypocotyl
tissues, GFP is found throughout the cytoplasm and nucleoplasm. The relationship of
cells within the tissues is clearly discernible. In highly vacuolate epidermal cells in the
root and hypocotyl, GFP fluorescence allows visualisation of trans-vacuolate cytoplasmic
threads, and the thin cytoplasmic strands which underlie the cell wall and which may be
aligned with cytoskeletal elements. The movement of organelles through cytoplasmic
streaming could also be observed in these living cells.

Example 2 



Isolation and characterisation of a bright mutant of GFP 

The sequence of m-gfp was mutated by PCR in the presence of limiting nucleotide 
concentrations. The template plasmid was pBSm-gfp4, a derivative of TU#65 (Chalfie 
et al., 1994 Science 263, 802-805) in which gfp has been replaced with m-gfp. The 
primers used were the T3 and T7 primers (New England Biolabs) that are complementary 
to the flanking T3 and T7 promoters present in the vector sequence. Four separate 
reactions (30 cycles of 30 seconds at 94°C, 30 seconds at 55°C and 1 min at 72°C using 
Taq DNA Polymerase from Promega) were carried out, each with the concentration of 
a different nucleotide reduced from 200 µM to 20 µM. The amplified fragments were 
pooled, cleaved with KpnI and EcoRI and cloned downstream of the lac promoter of 
pBluescript II KS (+) (Stratagene). 
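
The design of the four limiting-nucleotide reactions can be sketched as follows. This is only an illustration: the dNTP names and the function name are assumptions introduced here, while the 200 µM and 20 µM concentrations are those stated above.

```python
# Standard and limiting dNTP concentrations (µM), as stated in the text.
STANDARD_UM = 200
LIMITING_UM = 20

# The four nucleotides; naming them as dNTPs is an assumption of this sketch.
DNTPS = ("dATP", "dCTP", "dGTP", "dTTP")


def limiting_nucleotide_mixes():
    """Return one dNTP mix per reaction, each limiting a different nucleotide."""
    mixes = {}
    for limited in DNTPS:
        mixes[limited] = {
            dntp: (LIMITING_UM if dntp == limited else STANDARD_UM)
            for dntp in DNTPS
        }
    return mixes


for limited, mix in limiting_nucleotide_mixes().items():
    print(f"Reaction limiting {limited}: {mix}")
```

Each of the four mixes differs from the others only in which nucleotide is present at the reduced concentration.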

The mutant library thus obtained was transformed into E. coli strain XL1-Blue 
(Stratagene) and incubated overnight at 37°C on TYE agar containing 50 µg/ml 
ampicillin and 1 mM IPTG. Colonies were illuminated with a long wavelength UV lamp 
(UVP Model B 100 AP) and visually screened for increased fluorescence. The coding 
regions of two of the brightest mutant genes (m-gfpA and m-gfpB) thus identified, as well 
as that of m-gfp, were amplified by PCR (30 cycles of 1 min at 94°C, 1 min at 55°C and 
1 min at 72°C using VENT DNA Polymerase from New England Biolabs) using primers 
that generate a BamHI-SacI fragment containing the gfp coding sequence downstream of 
a phage Shine-Dalgarno sequence and a plant translation initiation context sequence. The 
forward primer (5'Bam-GFP) was 

5'-GGCGGATCCAAGGAGATATAACAATGAGTAAAGGAGAAGAACTTTTCACT-3' 

(Seq ID No. 5; BamHI site underlined, Shine-Dalgarno sequence in italics and 
translation initiation context sequence in bold) and the reverse primer (GFP-3'Sac) was 
5'-GGCGAGCTCTTATTTGTATAGTTCATCCATGCC-3' (Seq ID No. 6; SacI site 
underlined and GFP stop codon in bold). The amplified fragments obtained from the 
three reactions were cleaved with BamHI and SacI and cloned downstream of the lac 
promoter of pUC119 (Vieira and Messing, 1987 Methods Enzymol. 153, 3-11). 
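
As an illustrative check of the reverse primer design (Seq ID No. 6), the short sketch below locates the SacI site and reads the gene-specific part of the primer on the coding strand. It assumes the primer reads GGCGAGCTC... with an intact SacI recognition sequence (GAGCTC); the helper names and the small codon table are introduced here for illustration only.

```python
# Reverse primer (Seq ID No. 6), read with an intact SacI site; this reading
# is an assumption of the sketch.
REVERSE_PRIMER = "GGCGAGCTCTTATTTGTATAGTTCATCCATGCC"
SACI_SITE = "GAGCTC"  # SacI recognition sequence

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Only the codons needed for this short stretch (standard genetic code).
CODONS = {"GGC": "G", "ATG": "M", "GAT": "D", "GAA": "E",
          "CTA": "L", "TAC": "Y", "AAA": "K", "TAA": "*"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def translate(seq):
    """Translate a DNA sequence codon by codon using the table above."""
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, len(seq), 3))


site_start = REVERSE_PRIMER.find(SACI_SITE)
gene_part = REVERSE_PRIMER[site_start + len(SACI_SITE):]
coding_strand = reverse_complement(gene_part)
peptide = translate(coding_strand)
print(site_start, coding_strand, peptide)
```

On the coding strand, the primer region downstream of the restriction site ends with a TAA stop codon, in keeping with the stop codon noted for this primer.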

The positions of the mutations responsible for the bright phenotypes of m-gfpA and m- 
gfpB were then localised by recombination of the mutant genes with m-gfp. The pUC119 
derivatives containing m-gfpA and m-gfpB were cleaved with either BamHI and NcoI, 
NcoI and ClaI, or ClaI and SacI. The restriction fragments were gel purified and ligated 
to the m-gfp pUC119 derivative that had been cleaved with the same combination of 
enzymes and gel purified. These and the parent constructs were introduced into XL1- 
Blue cells and incubated overnight at 37°C on agar plates containing ampicillin and 
IPTG. Comparison of the fluorescence of colonies containing the various constructs 
revealed that the mutation(s) responsible for the bright phenotypes of both m-gfpA and 
m-gfpB were contained within the 336 bp ClaI-SacI fragment at the 3' end of the gene. 
These fragments were cloned into the phage vector M13mp19 (New England Biolabs) and 
sequenced from the Universal primer using the Sequenase Version 2.0 DNA Sequencing 
Kit (United States Biochemical Corporation). 

Sequencing of the ClaI-SacI fragment of m-gfpB revealed the presence of a single coding 
alteration, V163A. This same change was found in combination with a second coding 
alteration, S175G, in the ClaI-SacI fragment of m-gfpA. The sequences of mGFP, 
mGFPB and mGFPA are compared in Figure 6, which shows the location of the two 
mutations. The S175G change would appear to contribute to the phenotype of GFPA as 
cells expressing the GFPA protein were clearly more fluorescent than those expressing 
GFPB (data not shown). Thus, only GFPA was analysed further. 

The sequence of the m-gfp gene was modified so as to code for the V163A and S175G 
substitutions described above and the I167T substitution described in the prior art (Heim 
et al., 1994 Proc. Natl. Acad. Sci. USA 91, 12501-12504, which substitution inverts 
the ratio of the 400 nm to 475 nm excitation peaks), as well as to further alter the codon usage 
of the gene in order to eliminate potential plant intron sequences generated by the 
introduction of these mutations. The sequence differences between this modified gene, 
termed m-gfp5, and the original gfp gene, and their respective polypeptides, are 
summarised in Fig. 15 (Seq. ID Nos. 5-8). 
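
The substitution notation used here (for example V163A: valine at position 163 replaced by alanine) can be made concrete with a minimal sketch. The function name and the toy peptide are hypothetical and are not the GFP sequence; the sketch only shows how such substitutions are read and combined.

```python
import re


def apply_substitution(protein, substitution):
    """Apply a substitution such as 'V163A' (1-based position) to a protein string."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution)
    if match is None:
        raise ValueError(f"unrecognised substitution: {substitution}")
    original, replacement = match.group(1), match.group(3)
    index = int(match.group(2)) - 1  # convert to 0-based index
    if protein[index] != original:
        raise ValueError(
            f"expected {original} at position {index + 1}, found {protein[index]}"
        )
    return protein[:index] + replacement + protein[index + 1:]


# Toy 8-residue peptide used purely for illustration (NOT the GFP sequence).
toy_peptide = "MKVISLGA"
mutant = toy_peptide
for change in ("V3A", "I4T", "S5G"):
    mutant = apply_substitution(mutant, change)
print(toy_peptide, "->", mutant)
```

Checking the expected original residue before substituting guards against applying a change at the wrong position.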

The m-gfp5 gene was constructed by PCR amplification (30 cycles of 30 secs at 94°C, 
30 secs at 55°C and 30 secs at 72°C using VENT DNA Polymerase) of m-gfp using 
mutagenic primers. The forward primer was an oligo corresponding to nucleotides 445- 
560 of the m-gfp5 coding sequence shown in Figure 15 and the reverse primer was GFP- 
3'Sac. The amplified fragment was cleaved and exchanged with the AccI-SacI fragment 
of m-gfp to create m-gfp5. 

For bacterial expression studies, BamHI-SacI PCR fragments containing the m-gfp, m- 
gfpA and m-gfp5 genes were cloned downstream of the tac promoter of the expression 
vector pSE380 (Invitrogen), to give the plasmids pSE-GFP, pSE-GFPA and pSE-GFP5, 
respectively. Expression from the tac promoter of pSE380 is tightly regulated due to the 
presence on the plasmid of the lacIq gene. For yeast expression, the same PCR 
fragments containing the m-gfp, m-gfpA and m-gfp5 genes were inserted downstream of 
the constitutive ADH1 promoter of pVT103-U (Vernet et al., 1987), a yeast multicopy 
episomal plasmid containing the URA3 selectable marker. The resulting plasmids were 
pVT-GFP, pVT-GFPA and pVT-GFP5, respectively. 

To assess the difference in fluorescence between strains expressing modified GFP and 
GFPA quantitatively, the inventors introduced expression plasmids pSE-GFP and pSE- 
GFPA into E. coli strain XL1-Blue and measured the fluorescence (λex = 397 nm, λem 
= 508 nm) of equal optical densities of cells at various times following IPTG-induction 
of protein synthesis at 37°C (Fig. 7). 4.5 hrs after induction, cells expressing m-GFPA 
were observed to fluoresce approximately 20-fold more intensely than those expressing 
m-GFP, a figure which increased to approximately 35-fold by the time the cells had 
entered stationary phase (9 hrs after induction). 

To determine whether the enhanced fluorescence of cells expressing m-GFPA might be 
due to increased levels of protein expression, total protein was prepared from cells from 
the 4.5 hr time point and the amount of intracellular GFP estimated by Western blot 
analysis. As can be seen in Fig. 8, m-GFPA accumulates to a significantly higher level 
than modified GFP. The "vector" track is a negative control. The numbers on the right 
of the blot represent the molecular weights of known standards. However, the difference 
in protein levels as estimated by quantification of band intensities, 2.4-fold, is not nearly 
enough to account for the approximately 20-fold difference in fluorescence levels 
observed at this time point. This result suggests that a large proportion of GFP that is 
expressed in cells at 37°C is non-fluorescent and that the substitutions present in m- 
GFPA enhance the maturation of the protein to the fluorescent form. Comparison of the 
growth curves of strains expressing m-GFP and m-GFPA with the growth curve of a non- 
expressing strain (data not shown) indicated that expression of these proteins does not 
have any adverse effects on the growth of bacterial cells. 
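
The reasoning behind the maturation argument can be illustrated with a small worked calculation. It uses only the two fold differences quoted above for the 4.5 hr time point; the variable names are introduced here for illustration, and the result is an approximate figure, not a measured value.

```python
# Fold differences (m-GFPA relative to m-GFP) quoted in the text for the
# 4.5 hr time point.
FLUORESCENCE_FOLD = 20.0  # approximate difference in fluorescence
PROTEIN_FOLD = 2.4        # difference in protein level from band intensities

# Fluorescence per unit of GFP protein, m-GFPA relative to m-GFP.
fluorescence_per_protein_fold = FLUORESCENCE_FOLD / PROTEIN_FOLD

print(
    "Fluorescence per unit protein: about "
    f"{fluorescence_per_protein_fold:.1f}-fold higher for m-GFPA"
)
```

Under these figures, roughly an 8-fold difference in fluorescence remains after allowing for the difference in protein level, which is the part attributed to improved maturation.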

The amino acid substitutions present in m-GFPA suppress the temperature-sensitivity 
of GFP maturation 

Lim and co-workers (Lim et al., cited above) have recently reported that maturation of 
GFP to the fluorescent form is sensitive to temperature during expression in the yeast 
Saccharomyces cerevisiae. To test whether the same may be true during expression in 
E. coli and whether the substitutions present in m-GFPA enhance maturation by 
suppressing any such sensitivity, the inventors examined expression of m-GFP and m- 
GFPA over a range of different temperatures. Strains containing pSE-GFP and pSE- 
GFPA were induced overnight at temperatures ranging between 25°C and 42°C. For 
each culture, the fluorescence of equal optical densities of cells was measured and the 
amount of intracellular GFP determined by Western blot analysis (Table 1). 
Fluorescence values were then normalised against the amount of GFP present inside cells 
so as to give a relative measure of the proportion of intracellular GFP that is fluorescent 
for each culture. The results (Fig. 9) clearly show that the proportion of modified GFP 
that is fluorescent steadily decreases with increasing incubation temperature (open 
columns), indicating that maturation of the protein to the fluorescent form is temperature 
sensitive. In contrast, the substitutions present in m-GFPA (shaded columns) suppress 
this sensitivity to temperature, with maturation being optimal at 37°C. 



Table 1 

Temperature   Fluorescence (arbitrary units)   Relative amount of intracellular GFP (units) 
(°C)          m-GFP        m-GFPA              m-GFP        m-GFPA 

25            328.4        722.2               0.29         0.58 
30            100.5        541.1               0.21         0.82 
37            67.9         2273.0              0.23         1.00 (7.8 x 10^[illegible]) 
42            9.2          369.4               0.17         0.44 
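
The normalisation described for Table 1 (fluorescence divided by the relative amount of intracellular GFP) can be sketched as follows. The values are those of Table 1 as transcribed; the dictionary layout and function name are assumptions of this sketch, and the resulting ratios are in arbitrary units.

```python
# Table 1 as transcribed: temperature (°C) -> (fluorescence, relative GFP amount).
M_GFP = {25: (328.4, 0.29), 30: (100.5, 0.21), 37: (67.9, 0.23), 42: (9.2, 0.17)}
M_GFPA = {25: (722.2, 0.58), 30: (541.1, 0.82), 37: (2273.0, 1.00), 42: (369.4, 0.44)}


def normalised_fluorescence(table):
    """Fluorescence divided by relative intracellular GFP, per temperature."""
    return {temp: fluor / amount for temp, (fluor, amount) in table.items()}


m_gfp_norm = normalised_fluorescence(M_GFP)
m_gfpa_norm = normalised_fluorescence(M_GFPA)

for temp in sorted(M_GFP):
    print(f"{temp} °C: m-GFP {m_gfp_norm[temp]:7.1f}   m-GFPA {m_gfpa_norm[temp]:7.1f}")
```

With these values the normalised figure for m-GFP falls at each temperature step, whereas that for m-GFPA is highest at 37°C.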



Investigation into the thermosensitivity of GFP maturation 

The post-translational maturation of GFP to the fluorescent form involves a number of 
steps. The first step, presumably, is folding of the apoprotein into a catalytic 
conformation that facilitates the novel reactions involved in formation of the 
chromophore. These reactions consist of cyclisation and oxidation of the tripeptide 
Ser65-Tyr66-Gly67 to give a p-hydroxybenzylidene-imidazolidinone structure. Once the 
chromophore has been formed, it is then only fluorescent once GFP has adopted a fold 
which protects it from solvent effects. In principle, any of these processes could be 
sensitive to temperature and thus be responsible for the observed thermosensitivity of 
GFP maturation. 

Since the oxidation reaction involved in chromophore formation appears to require 
molecular oxygen, Heim and co-workers (Heim et al., 1995 Nature 373, 663-664) have 
been able to measure the reaction rate by expressing GFP in E. coli under anaerobic 
conditions and then monitoring the development of fluorescence after admission of air. 
To determine whether this reaction might be temperature sensitive and whether the 
substitutions present in m-GFPA act by enhancing its rate at higher temperatures, the 
inventors measured the rates of oxidation of m-GFP and m-GFPA at both 25°C and 37°C. For 
these experiments a yeast expression system was used, which provided for better growth 
and expression levels than E. coli under anaerobic conditions. Strains of Saccharomyces 
cerevisiae MGLD-4a (MATa, leu1, ura3, his3, trp1, lys2) containing either pVT-GFP or 
pVT-GFPA were incubated anaerobically (Becton-Dickinson BBL GasPak Pouch) 
overnight at 30°C in synthetic drop-out media lacking uracil (Rose et al., 1990 "Methods 
in Yeast Genetics: A Laboratory Course Manual", Cold Spring Harbor Laboratory 
Press, Cold Spring Harbor, USA). Following admission of air to the pouch, 1.0 ml of 
each culture was immediately centrifuged for 1 min at 13,000 rpm and resuspended in 
0.5 ml aerated and prewarmed PBS (pH 7.4) containing 8 mM NaN3 as a metabolic 
inhibitor. Cell suspensions were placed immediately into pre-warmed cuvettes held 
within the fluorimeter carousel and the time course of fluorescence (λex = 397 nm, λem 
= 508 nm) development measured. 

As reported previously by Heim et al., each oxidation proceeded as a simple first order 
reaction (Fig. 10). Figure 10 shows the rate of fluorescence development for cultures 
expressing modified GFP at 25°C (crosses) or 37°C (triangles), or cultures expressing 
GFPA at 25°C (squares) or 37°C (circles). 

The time constant measured for the oxidation of m-GFP at 37°C (5.9 ± 0.1 min) was 
found to be approximately 3-fold faster than that measured at 25°C (16.2 ± 0.3 min), 
indicating that the post-translational oxidation of the GFP chromophore is not the step 
responsible for the temperature sensitivity of maturation. In confirmation of this 
conclusion, the time constants derived for m-GFPA at both 25°C and 37°C (22.5 ± 1.4 
min and 18.1 ± 0.4 min, respectively) were actually slower than those measured for m- 
GFP. 
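
The first-order behaviour referred to above can be illustrated with a short sketch. It assumes that each quoted time constant is the value τ in a simple first-order approach to completion, F(t) = F_max (1 - exp(-t/τ)); that interpretation, and the function names, are assumptions made here rather than details given in the text.

```python
import math

# Time constants (min) quoted in the text, keyed by (protein, temperature in °C).
TAU_MIN = {
    ("m-GFP", 25): 16.2,
    ("m-GFP", 37): 5.9,
    ("m-GFPA", 25): 22.5,
    ("m-GFPA", 37): 18.1,
}


def fraction_oxidised(t_min, tau_min):
    """Fraction of the final fluorescence reached after t_min minutes."""
    return 1.0 - math.exp(-t_min / tau_min)


def half_time(tau_min):
    """Time (min) to reach half of the final fluorescence."""
    return tau_min * math.log(2)


# Ratio of the m-GFP time constants at 25°C and 37°C.
speedup = TAU_MIN[("m-GFP", 25)] / TAU_MIN[("m-GFP", 37)]
print(f"m-GFP oxidation is about {speedup:.1f}-fold faster at 37°C than at 25°C")

for (protein, temp), tau in TAU_MIN.items():
    print(f"{protein} at {temp}°C: half-time about {half_time(tau):.1f} min")
```

Under this reading, the ratio of the two m-GFP time constants is a little under 3, in line with the approximately 3-fold figure stated above.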

Heim and co-workers have also reported that some GFP forms non-fluorescent inclusion 
bodies during expression in E. coli, indicating that not all GFP folds properly under these 
conditions. To determine whether the proper folding of m-GFP might be temperature 
sensitive and whether the substitutions present in m-GFPA act by enhancing proper 
folding at increased temperatures, the inventors examined the solubilities of the two 
proteins during expression in E. coli at 25°C and 37°C. Bacterial cells expressing m- 
GFP or m-GFPA were grown overnight at either 25°C or 37°C, lysed, and the soluble 
and insoluble fractions separated by centrifugation. 

Specifically, cells containing pSE-GFP or pSE-GFPA were grown in 1.5 ml of 2xTY 
broth to an absorbance of 0.2 at 600 nm and then induced overnight with 0.2 mM IPTG. 
The cultures were centrifuged at 13,000 rpm for 2 min, resuspended in 500 µl 50 mM 
Tris-HCl (pH 8.0), 2 mM EDTA, 100 µg/ml lysozyme, 0.1% Triton X-100 and 
incubated at 30°C for 15 min. Cells were then lysed by sonication (5 x 15 secs) using 
a Heat Systems (Model CL4) sonicator and centrifuged at 13,000 rpm for 15 min at 4°C. 
The supernatant (soluble fraction) was removed and stored at -70°C until used. The 
pellet (insoluble fraction) was washed once with 500 µl 50 mM Tris-HCl (pH 8.0), 10 
mM EDTA, 0.5% Triton X-100, resolubilised for 1 hr at room temperature in 500 µl 
resolubilisation buffer (8 M urea, 0.1 M NaH2PO4, 10 mM Tris-HCl, pH 6.3) and stored 
at -70°C until used. The amount of m-GFP or m-GFPA present in each fraction was 
then estimated by Western blot analysis (Fig. 11). 

SDS-PAGE and Western blot analysis were carried out according to normal procedure 
(Sambrook et al., 1989 "Molecular Cloning: A Laboratory Manual", Cold Spring 
Harbor Laboratory Press, Cold Spring Harbor, USA). Primary antibodies were 
polyclonal rabbit anti-GFP (generous gift of S. Santa-Cruz) used at a dilution of 1/2,000. 
Antibodies were detected with iodinated Protein A (Amersham) and bands visualised and 
quantified using a Molecular Dynamics PhosphorImager. In Figure 11, insoluble and 
soluble fractions are denoted by the letters I and S respectively, with cultures grown at 
25°C shown on the left, and cultures grown at 37°C shown on the right. In all cases, 
fluorescence was found almost exclusively in the soluble fraction. At 25°C, both m-GFP 
and m-GFPA were found predominantly in the soluble fraction, indicating that proper 
folding of both proteins is efficient at this temperature. At 37°C, however, the majority 
of m-GFP was found as aggregated protein in the insoluble fraction, whereas most of m- 
GFPA was still present in the soluble fraction. This result indicates that the temperature 
sensitivity of m-GFP maturation is due primarily to improper protein folding at higher 
temperatures and that this defect is suppressed by the amino acid substitutions present in 
m-GFPA. 

To gain information on which species in the maturation pathway of GFP aggregates at 
higher temperatures, the inventors made use of the characteristic absorption of the GFP 
chromophore in either the mature (Ward & Bokman, 1982 Biochemistry 21, 4535-4540) 
or chemically reduced state (Inouye & Tsuji, 1994 FEBS Lett. 351, 211-214). If the 
aggregating species has already undergone the cyclisation reaction, GFP isolated from 
inclusion bodies should show this characteristic absorption. To facilitate the purification 
of protein for absorbance measurements, the inventors fused a polyhistidine tag to the C- 
terminus of m-GFP. 

Histidine-tagging was achieved by the addition of 6 histidine codons to the 3' ends of the 
modified gfp genes by PCR. The genes were amplified using 5'Bam-GFP as the forward 
primer and the oligo 

5'-GCCGAGCTCTTAGTGGTGGTGGTGGTGGTGTTTGTATAGTTCATCCATGCC-3' 
(Seq ID No. 7; SacI site underlined, histidine codons in bold) as the reverse primer. 
The amplified fragments were cleaved with BamHI and SacI and cloned downstream of 
the tac promoter of pSE380 (Invitrogen) to give the expression plasmids pSE-GFPHis, 
pSE-GFPAHis and pSE-GFP5His respectively. 

For the purification of histidine-tagged GFP for absorbance measurements, soluble and 
insoluble fractions of cells containing pSE-GFPHis grown at 25 °C and 37°C, 
respectively, were prepared as described previously. GFP was purified from the fractions 
on Ni-chelate columns using the Ni-NTA Spin Kit (Qiagen). Purification from the 
soluble fraction was carried out according to the protocol for the purification of histidine- 
tagged proteins under native conditions. After clearance of cellular debris from the 
insoluble fraction by centrifugation at 13,000 rpm for 30 min, purification was carried 
out according to the protocol for purification of histidine-tagged proteins under denaturing 
conditions, except that the protein was eluted with resolubilisation buffer containing 250 
mM imidazole. 



For the purification of histidine-tagged proteins for fluorescence spectroscopy, cells were 
grown in 100 ml of 2xTY broth at 37°C to an absorbance of 0.2 at 600 nm and then 
induced overnight with 0.5 mM IPTG. Cells were harvested by centrifugation at 6,000 
rpm for 10 min and lysed by resuspension in 4 ml 20 mM Tris-HCl (pH 7.9), 500 mM 
NaCl, 5 mM imidazole, 0.1% sarkosyl, 0.1% deoxycholate, 2.25 M guanidine-HCl. 
Nucleic acids were precipitated by the addition of 5 ml isopropanol and removed by 
centrifugation at 10,000 rpm for 10 min. Fluorescent histidine-tagged proteins were 
purified from the supernatant on Ni-chelate columns (Qiagen) and eluted with 2 ml of 
20 mM Tris-HCl (pH 7.9), 500 mM NaCl, 150 mM imidazole. For all purifications, 
protein purity was assayed by SDS-PAGE and found to be >95%. Protein 
concentrations were determined by Bradford assay (Bio-Rad Protein Assay kit) using 
bovine serum albumin as a standard. 

Absorbance spectra were recorded on a Cary 3 UV-Visible Spectrophotometer (Varian) 
at 25°C. The optical pathlength was 1 cm. Fluorescence spectra were recorded on a 
Hitachi F-4500 fluorimeter at 25°C using 4mm/10mm cuvettes. The bandpass for both 
the excitation and the emission monochromators was 5 nm, the scan speed 240 nm per 
min and the response time automatically adapted by the device. All spectra were 
corrected following the supplier's procedure for calibration of the fluorimeter using 
Rhodamine-B as standard. Emission spectra were recorded at a fixed wavelength of the 
excitation maximum, excitation spectra at a fixed wavelength of the emission maximum. 

Histidine-tagging of GFP did not detectably affect the temperature sensitivity of 
maturation of the protein (data not shown). The absorption spectra of equal 
concentrations of denatured protein from the two preparations were then recorded, as 
described above. The results are shown in Figure 12, which is a graph of absorbance 
against wavelength (nm). 

As can be seen in Fig. 12, denatured fluorescent protein derived from the soluble fraction 
of cells grown at 25°C shows a characteristic absorption peak similar to that of the m- 
GFP chromophore at acid pH (Ward & Bokman, cited above). On the other hand, 
protein purified from inclusion bodies of cells grown at 37°C shows no such absorption, 
indicating that the aggregating species has not formed a chromophore. Taken together, 
the results presented above indicate that the temperature sensitivity of m-GFP maturation 
is due primarily to the failure of the unmodified apoprotein to fold into its catalytically 
active conformation at higher temperatures. Furthermore, the amino acid substitutions 
present in m-GFPA suppress this defect by enhancing proper folding at elevated 
temperatures. 

If the thermosensitivity of m-GFP maturation observed in the yeast Saccharomyces cerevisiae 
is also a result of the thermosensitivity of apoprotein folding, it should be suppressed by 
the substitutions present in m-GFPA. To test this prediction, the inventors incubated 
strains of S. cerevisiae containing either pVT-GFP or pVT-GFPA on agar plates at either 
25°C or 37°C. As can be seen in Fig. 13, the substitutions present in m-GFPA also 
suppress the thermosensitivity of m-GFP expression in yeast. This result indicates that 
the temperature-dependent mis-folding of the m-GFP apoprotein is not simply an artefact 
of an E. coli overexpression system, but is also the basis for the thermosensitivity of m- 
GFP maturation in a heterologous eukaryotic system. 

Modification of the fluorescence spectra of m-GFPA 

Fluorescence spectroscopy of purified histidine-tagged m-GFP and m-GFPA revealed that 
the fluorescence spectra of m-GFPA are essentially unchanged from those of m-GFP 
except for a decrease in the amplitude of the 475 nm excitation peak relative to the 
amplitude of the 400 nm excitation peak (Fig. 14). Although this spectral change is 
advantageous for applications which utilise 400 nm excitation, it is also detrimental for 
those which utilise 475 nm excitation. For many experiments, the ideal spectral variant 
would be a protein which could be efficiently excited at either of these wavelengths. This 
characteristic would afford greater flexibility with regard to the range of applications in 
which the protein could be used. 

Recently, it has been demonstrated that, as observed here, the relative amplitudes of the 
excitation peaks of GFP can be altered by means of mutagenesis (Ehrig et al., 1995 
FEBS Lett. 367, 163-166; Delagrave et al., 1995 Bio/Technology 13, 151-154). A 
number of these mutations, like the substitutions present in m-GFPA, are located in the 
C-terminal region of the protein. It has been hypothesised that, in the three-dimensional 
structure of GFP, the C-terminal region is close to the chromophore and that mutations 
in this region can affect the microenvironment of the chromophore so as to influence the 
equilibrium between the two tautomeric forms of the chromophore that are responsible 
for the two excitation peaks (Heim et al., 1994 & Ehrig et al., 1995). One of these 
mutations, I167T, inverts the ratio of the 400 nm to 475 nm excitation peak heights. If 
the effects of mutations in the C-terminal region of GFP on the spectroscopic state of the 
chromophore are additive, then it is possible that combination of the I167T substitution 
with the substitutions present in GFPA might increase the amplitude of the 475 nm peak 
relative to the 400 nm peak. 

Histidine-tagged m-GFP5 was purified and its excitation and emission spectra analysed 
by fluorescence spectroscopy. As can be seen in Fig. 14, m-GFP5 has two excitation 
peaks (maxima at 395 nm and 473 nm) of almost exactly equal amplitude and an emission 
spectrum largely unchanged from that of m-GFP. To determine whether m-GFP5 has 
retained the thermotolerant phenotype of GFPA, bacterial cells containing pSE-GFP or 
an expression plasmid containing m-gfp5 (pSE-GFP5) were induced with IPTG for 5 
hours at 37°C. The fluorescence (λex = 395 nm or 473 nm, λem = 507 nm) of equal 
optical densities of cells was then measured. Cells expressing m-GFP5 were observed 
to fluoresce 39-fold more intensely than cells expressing m-GFP when excited at 395 nm 
and 111-fold more intensely when excited at 473 nm. These results indicate that m-GFP5 
has not only retained the thermotolerant phenotype of m-GFPA, but has improved upon 
it. 

Further modification of m-gfp5 was achieved. Two synthetic oligonucleotides were made, 
to act as mutagenic PCR primers to add an in-frame EcoRI site at the 5' end of the gene 
and to add a sequence coding for the amino acid tag HDEL at the C terminal of the 
protein (which tag acts as an endoplasmic reticulum localisation signal). 

The PCR mutagenised sequence was then used in a three-way ligation reaction with 
BamHI/SacI-cut vector and a pair of synthetic oligonucleotides with BamHI/EcoRI ends. 
The synthetic oligos had the sequences: 

5' GGC GGA TCC AAG GAG ATA TAA CAA TGA AGA CTA 
ATC TTT TTC TCT TTC TCA TCT TTT CAC 3' (Seq ID No. 8) and 

5' GCC GAA TTC GGC CGA GGA TAA TGA TAG 
GAG AAG TGA AAA GAT GAG AAA GA 3' (Seq ID No. 9) 

The oligos were annealed, extended with Klenow polymerase and cut with BamHI/EcoRI. 
When ligated with the m-gfp5HDEL gene, they introduced the signal sequence from 
Arabidopsis chitinase at the 5' end of the coding sequence. The nucleotide sequence of 
the resulting modified gene (m-gfp5-ER), and the amino acid sequence of its polypeptide 
product (Seq ID Nos. 10 and 11 respectively), are shown in Figure 16. The nucleotides 
encoding the signal sequence are shown in upper case letters, whilst the rest of the 
sequence is in lower case letters. The C terminal HDEL tag on the protein is apparent. 

The modified gene, when expressed in Arabidopsis, gave highly efficient concentration 
of GFP-mediated fluorescence in the endoplasmic reticulum (Figure 17). Referring to 
Figure 17, the panels illustrate confocal micrographs of 5-day old A. thaliana seedlings 
expressing m-GFP (panels A-D) or m-GFP5-ER (panels E-H), imaged at 395 nm 
excitation wavelength. Panels A & E are sections of the junction between the hypocotyl 
and the cotyledon (bar = 25 µm); panels B & F show hypocotyl epidermal cells (bar = 
10 µm); panels C & G show median longitudinal sections of root tips (bar = 25 µm); and 
panels D & H show roots, with nucleoplasmic accumulation of mGFP (in D) and 
retention in the endoplasmic reticulum of mGFP5-ER (in H) (bar = 10 µm). 



DISCUSSION 



GFP expression in plants 

The objective of this work was to begin the development of gfp for use as a genetic 
marker in transformation and as a reporter for localised gene expression in Arabidopsis 
and other plants. In order to successfully employ the gfp cDNA in plants, three major 
steps need to be addressed. 

(1) The GFP apoprotein must be produced in suitable amounts within the plant cells. 

(2) The apoprotein must undergo efficient post-translational cyclisation and oxidation to 
produce the mature GFP. 

(3) The fluorescent protein may need to be suitably targeted within the cell, to allow 
efficient post-translational processing, safe accumulation to high levels, or to allow easier 
distinction of expressing cells. 

The inventors have shown that expression of the jellyfish gfp cDNA in Arabidopsis is 
curtailed by aberrant splicing, with an 84 nucleotide intron being efficiently excised from 
within the GFP coding sequence. The recognition of introns in plant pre-mRNAs 
primarily requires conserved sequences found adjacent to the 5' and 3' splice sites, which 
are related to those found in other eukaryotes, and, atypically, a high A:U content within 
the intron. The inventors altered potential recognition sequences at the 5' splice site, and 
decreased the A:U content of the cryptic intron by in vitro mutagenesis to produce a 
modified m-gfp gene which was successfully expressed in transgenic Arabidopsis plants. 
It is likely that this m-gfp gene will be useful for expression studies in other plants, which 
appear to share similar features involved in intron recognition. 
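
The A:U content referred to above can be computed with a minimal sketch such as the following. The function name and both example fragments are hypothetical and are not taken from the gfp or m-gfp sequences; they only illustrate how lowering A:U content changes the measure.

```python
def au_content(sequence):
    """Fraction of A and U (or T) residues in a nucleotide sequence."""
    seq = sequence.upper().replace("T", "U")
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AU" for base in seq) / len(seq)


# Hypothetical fragments for illustration only (NOT the gfp cryptic intron).
au_rich_fragment = "GUAAGUUUAUAUAAAUUAUG"
modified_fragment = "GUCAGCCUGUACAAGCUGUG"

print(f"A:U content before modification: {au_content(au_rich_fragment):.2f}")
print(f"A:U content after modification:  {au_content(modified_fragment):.2f}")
```

Treating T as U lets the same function be applied to either the DNA or the transcript form of a sequence.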

It is also possible that aberrant splicing may interfere with GFP expression in other 
organisms. However, introns found in yeast possess a requirement for conserved 
sequences located at the branch point, and introns found in animal cells (including 
jellyfish) share a conserved polypyrimidine tract adjacent to the 3' splice site. The lack 
of these additional features may allow correct processing of the gfp mRNA in fungal and 
animal cells. 

With expression of the m-gfp gene in Arabidopsis, it has now been shown that the 
apoGFP readily undergoes maturation, and that the fluorescent form of the protein 
accumulates in transformed cells. Transformed cells were often intensely fluorescent, and 
were easily detectable by eye using a long-wave UV lamp. However, it proved difficult 
to efficiently regenerate fertile plants from the brightest transformants, with cells 
remaining as a highly fluorescent callus or mass of shoots after several months of culture. 
It is possible that high levels of GFP expression were mildly toxic or interfered with 
differentiation, due perhaps to the fluorescent or autocatalytic properties of the protein. 
In the natural situation, in jellyfish photocytes where high levels of GFP are well 
tolerated, the protein is found sequestered in microbody-like lumisomes. In contrast, the 
mature protein is found throughout the cytoplasm and nucleoplasm in transformed 
Arabidopsis. If GFP is a source of fluorescence-related free radicals, for example, it 
might be advisable to target the protein to a more localised compartment within the plant 
cell. Appropriate localisation signals are known to those skilled in the art and it should 
prove possible to incorporate these into the GFP polypeptide without unduly disrupting 
the fluorescence characteristics of the protein. 

The inventors have adapted the green fluorescent protein (GFP) of Aequorea victoria for 
use as a genetic marker in Arabidopsis thaliana. Transcripts of the jellyfish GFP coding 
sequence are mis-spliced in Arabidopsis, with an 84 nucleotide intron being efficiently 
excised. A modified version of the gfp sequence has been constructed to destroy this 
cryptic intron, and to restore proper expression of the protein in plant cells. GFP is 
mainly localised within the nucleoplasm and cytoplasm within transformed Arabidopsis 
cells, and its presence allows optical sectioning of intact plants using confocal laser 
scanning microscopy. The modified gfp sequence may be useful for directly monitoring 
gene expression and protein localization at high resolution, and as a simply scored genetic 
marker in living plants. 

A major use for m-gfp would be as a replacement for the β-glucuronidase gene, used as 
a reporter for promoter and gene fusions in transformed plants. Histochemical staining 
is used to identify cells expressing the GUS gene product, but a fluorescent product can 
be imaged directly and rapidly. Gene expression and protein localization can be observed 
in physiologically active cells without a prolonged and lethal staining procedure, and 
fluorescence microscopy techniques allow the high resolution imaging of GFP-expressing 
cells. In addition, it becomes feasible to follow dynamic events in living cells and 
tissues. 






High levels of fluorescence intensity are obtained in GFP-transformed bacterial and yeast 
colonies allowing simple screening for GFP expression with the use of a hand-held UV 
lamp. Such an assay for gene expression in living plants would be a very useful tool for 
plant transformation experiments. Many transformation techniques give rise to 
regenerating tissues which are variable or chimeric, and require testing of the progeny 
of the primary transformants. Potentially, m-gfp-transformed tissues could be monitored 
using in vivo fluorescence, avoiding any need for destructive testing, and the appropriate 
transformants could be rescued and directly grown to seed. Similarly, in vivo 
fluorescence would be an easily scored marker for field testing in plant breeding, 
allowing m-gfp-linked transgenes to be easily followed. 

Use of a confocal laser scanning microscope will allow the clear analysis of plant tissue 
whole-mounts despite the refractile nature and light scattering (and for some cells, 
autofluorescent) properties of plant cell walls. 

In this study, the inventors have also shown that maturation of GFP in E. coli is sensitive 
to temperature, due primarily to the mis-folding of the apoprotein into inclusion bodies 
at elevated temperatures. They have also described two mutants, m-GFPA and m-GFP5, 
whose folding is thermotolerant. At present, the characteristic of the GFP apoprotein 
that causes it to aggregate at higher temperatures and the mechanism by which the 
mutations present in m-GFPA and m-GFP5 suppress this effect are unknown. However, 
studies on the effects of mutations on the tendency of some other proteins to aggregate 
(Thomas et al., 1995 Trends Biochem. Sci. 20, 456-459; Mitraki et al., 1991 Science 
253, 54-58; Wetzel 1994 Trends Biotech. 12, 193-198; and Chrunyk et al., 1993 J. Biol. 
Chem. 268, 18053-18061) suggest a number of possibilities. 

The simplest explanation is that the native apoprotein or one of its folding intermediates 
is thermodynamically unstable and the protein aggregates when in the unfolded state. 
The substitutions present in the thermotolerant mutants could suppress this characteristic 
either by increasing the thermodynamic stability of the unstable species or by decreasing 
its steady state level by increasing the rate of chromophore cyclisation. Alternatively, 
higher temperatures can allow proteins to overcome the thermodynamic barriers to the 
formation of off-pathway folding intermediates which may become kinetically trapped by 
aggregation. It is possible, therefore, that the substitutions present in the heat-tolerant 
mutants act by suppressing such a phenomenon by directing folding along the correct 
pathway at elevated temperatures. It is also possible that the kinetic half-life of an 
aggregation-prone intermediate in the normal apoprotein folding pathway is significantly 
increased at higher temperatures. Mutations could suppress this characteristic either by 
decreasing the half-life of such an intermediate or by reducing its tendency to aggregate. 
Finally, the suppressor mutations may favour proper folding at higher temperatures by 
increasing the affinity of the apoprotein for a molecular chaperone. 

To differentiate between these possibilities, biophysical analyses of variants of 
GFP and the heat-tolerant mutants that cannot undergo cyclisation and are thus trapped 
as apoproteins will be required. However, the observation that the substitutions present 
in m-GFPA increase the Tm of mature GFP by 4.0°C (data not shown) provides 
preliminary evidence that they may act by increasing the thermodynamic stability of the 
native apoprotein. As well as enhancing proper folding, the substitutions present in 
m-GFPA contribute to the bright phenotype of the mutant protein by facilitating its 
accumulation to higher levels than m-GFP (Table 1). This observation indicates that m-GFP 
is turned over more rapidly than m-GFPA, probably because partially folded or mis-folded 
m-GFP apoprotein that does not aggregate would be degraded by the cellular proteolytic 
machinery. 

The inventors have shown that oxidation of the GFP chromophore does not contribute to 
the temperature sensitivity of maturation by measuring the reaction rate in yeast cells at 
both 25°C and 37°C (Fig. 3). An interesting point arising from this experiment is that 
the time constants derived for m-GFP at 25°C and 37°C (5.9 ± 0.1 min and 16.2 
± 0.3 min, respectively) are significantly shorter than the 120 min estimated for the 
oxidation of GFP in bacteria by Heim et al. This observation may reflect a difference 
in the physiological states of yeast and bacterial cells following anaerobic growth or 
perhaps the presence of a catalysing factor in yeast cells. 
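
The time constants quoted above can be converted into half-times and expected fractions of oxidised chromophore with a short calculation. The following is a minimal sketch only: it assumes that oxidation follows a single-exponential (first-order) time course and that the 120 min figure of Heim et al. is directly comparable as a time constant. Both are assumptions made for illustration, and the function names are hypothetical.

```python
import math

# Time constants in minutes, taken from the text above.
# Treating the 120 min estimate as a first-order time constant is an assumption.
TIME_CONSTANTS_MIN = {
    "m-GFP in yeast, 25 C": 5.9,
    "m-GFP in yeast, 37 C": 16.2,
    "GFP in bacteria (Heim et al.)": 120.0,
}


def half_time(tau_min):
    """Half-time of a first-order process with time constant tau (minutes)."""
    return tau_min * math.log(2)


def fraction_oxidised(t_min, tau_min):
    """Fraction of chromophore oxidised after t minutes, assuming 1 - exp(-t/tau)."""
    return 1.0 - math.exp(-t_min / tau_min)


if __name__ == "__main__":
    for label, tau in TIME_CONSTANTS_MIN.items():
        print(
            f"{label}: half-time {half_time(tau):.1f} min, "
            f"fraction oxidised after 30 min {fraction_oxidised(30.0, tau):.2f}"
        )
```

Under these assumptions, the comparison shows how much sooner fluorescence could in principle appear if oxidation, rather than folding, were the only limiting step.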

Nevertheless, these results suggest that the oxidation of the GFP chromophore has the 
capacity to proceed at a much higher rate than previously thought. Therefore, in some 
cases, the factor which limits how quickly fluorescence can be observed following protein 
synthesis may be the efficiency with which the apoprotein folds rather than the time taken 
for oxidation of the chromophore. 

Examination of the fluorescence spectra of m-GFPA revealed a decrease in the 
amplitude of the 475nm excitation peak relative to the amplitude of the 400nm excitation 
peak. This result indicates that mutations in the C-terminal region of GFP are able to 
modulate the spectroscopic state of the chromophore by affecting its local environment 
within the protein. The inventors have utilised this phenomenon to engineer the 
fluorescence spectra of m-GFP5 by introducing a third substitution, I167T, into the 
C-terminal region of m-GFPA. m-GFP5 has two excitation peaks (maxima at 395nm and 
473nm) of almost exactly equal amplitude and is thus ideal as a multi-purpose spectral 
variant which can be used for applications requiring both UV and blue excitation. Since 
the substitutions present in m-GFPA and m-GFP5 affect the environment of the 
chromophore, it is likely that they also influence the intrinsic brightness of the mutant 
proteins by affecting the extinction coefficients and/or quantum yields of the chromophore 
at the two excitation wavelengths. However, to measure an extinction coefficient 
accurately, one must be certain that every GFP molecule in a given sample is mature. 
The results presented here suggest that, even in a soluble fraction, there may be 
appreciable amounts of mis-folded or non-fluorescent apoprotein. In support of this 
notion, the ratio of the absorbance of the chromophore to that of the aromatic amino 
acids of histidine-tagged m-GFP purified from the soluble fraction of bacterial cells 
grown at either 25°C (Fig. X+4) or 37°C (Inouye & Tsuji, cited above) is 
approximately 0.4. Since this value is in excess of 1.0 for either native or acid-denatured 
GFP isolated directly from the Aequorea jellyfish, it would appear that more than half 
of the recombinant GFP in a soluble fraction does not have a chromophore. 
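
The arithmetic behind this estimate can be sketched as follows. The sketch assumes that the absorbance ratio scales linearly with the fraction of molecules carrying a chromophore and that GFP isolated from the jellyfish is fully mature; neither assumption is stated in the text, and the function name is hypothetical.

```python
def chromophore_fraction(observed_ratio, native_ratio):
    """Estimate the fraction of molecules carrying a chromophore.

    Assumes the chromophore/aromatic absorbance ratio is proportional to the
    fraction of mature protein and that the native sample is fully mature.
    """
    if native_ratio <= 0:
        raise ValueError("native_ratio must be positive")
    return min(observed_ratio / native_ratio, 1.0)


if __name__ == "__main__":
    observed = 0.4            # recombinant m-GFP, soluble fraction (from the text)
    native_lower_bound = 1.0  # native GFP ratio is "in excess of 1.0" (from the text)

    # Because the native ratio is at least 1.0, this is an upper bound on the
    # fraction of recombinant protein that has a chromophore.
    upper = chromophore_fraction(observed, native_lower_bound)
    print(f"at most {upper:.0%} with chromophore, at least {1 - upper:.0%} without")
```

Because the native ratio exceeds 1.0, the value computed here is an upper bound on the mature fraction, which is consistent with the statement that more than half of the soluble recombinant protein lacks a chromophore.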

These observations suggest that extinction coefficients and quantum yields may be 
difficult to measure unambiguously for recombinant forms of GFP. Therefore, great care 
must be taken when interpreting the effects of mutations that alter the brightness of GFP. 
For example, a number of mutations in and near the chromophore of GFP have recently 
been described that cause significant shifts in the excitation and/or emission spectra of 
the protein. A subset of these mutations that alter the tyrosine residue at position 66 to 
tryptophan, histidine or phenylalanine progressively blue-shift the excitation and emission 
spectra. However, these mutant proteins are much less fluorescent than GFP, a 
phenomenon which has been attributed to them having sub-optimal extinction coefficients 
and/or quantum yields due to the poor fit of the alternative amino acids into the central 
cavity normally occupied by the tyrosine residue. It is possible, however, that the 
observed low fluorescence of these mutants is due to detrimental effects of the 
substitutions on folding and/or chromophore formation, resulting in the presence of large 
amounts of non-fluorescent protein in soluble fractions. Therefore, it is feasible that the 
proper maturation of these mutants might be enhanced by the introduction of the amino 
acid substitutions present in m-GFPA or m-GFP5. Indeed, expression of a protein 
containing the Y66H mutation in combination with the substitutions present in m-GFPA 
in E. coli at 37°C resulted in a 29-fold increase in fluorescence. The fluorescence 
spectra of this hybrid protein were also unchanged from those of the Y66H mutant alone 
(data not shown). Therefore, it is foreseeable that the substitutions present in m-GFPA 
and m-GFP5 may be combined with these and other pre-existing spectral mutations in the 
chromophore of GFP to produce a range of spectral variants with greatly improved 
maturation characteristics. 



As well as in E. coli, maturation of m-GFP appears to be thermosensitive in the yeast 
Saccharomyces cerevisiae and in mammalian cells. Therefore, it appears that the 
sensitivity of apoprotein folding to temperature may be a ubiquitous phenomenon. 
Indeed, it is interesting to note that the brightness of m-GFP in Arabidopsis thaliana is 
markedly increased by its retention in the endoplasmic reticulum (see accompanying 
paper), where a high concentration of chaperonins may enhance proper folding. It is 
unlikely, however, that the folding defect of m-GFP would manifest itself in the same 
way in systems where lower expression levels mean that aggregation may not occur to 
the same extent as in an E. coli overexpression system. Rather, it is more likely that 
partially folded or mis-folded apoprotein would, if it did not aggregate heavily, become a target 
of the cellular proteolytic machinery. Therefore, rather than becoming kinetically trapped 
by aggregation, improperly folded apoprotein would be depleted by degradation. In 
support of this notion, accumulation in yeast of a GFP-nucleoplasmin fusion protein 
expressed from two gene copies steadily decreases with increasing incubation 
temperature. 

Moreover, if the sensitivity of m-GFP apoprotein folding to temperature is a ubiquitous 
phenomenon, then use of the thermotolerant mutants described here should result in 
improved expression in a wide range of experimental systems. Indeed, in this work the 
inventors have demonstrated that the substitutions present in m-GFPA are capable of 
suppressing the thermosensitivity of m-GFP expression in the yeast Saccharomyces 
cerevisiae. Expression of m-GFPA has also been observed to give rise to significantly 
increased fluorescence in Drosophila melanogaster embryos incubated at 25°C (A. Brand, 
personal communication) and expression of m-GFP5 fused to endoplasmic reticulum 
retention signals has been observed to result in high levels of fluorescence in Arabidopsis 
thaliana (data not shown). Most strikingly, expression of both m-GFPA and m-GFP5 
has been found to result in greatly increased levels of fluorescence in mammalian cells. 
Therefore, we anticipate that the thermotolerant mutants described in this work and 
spectral variants derived from them will be of great benefit for expression in many 
experimental systems, particularly those such as mammalian cells that utilise higher 
incubation temperatures. 
SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: 
(A) NAME: Medical Research Council 
(B) STREET: 20 Park Crescent 
(C) CITY: London 
(E) COUNTRY: United Kingdom 
(F) POSTAL CODE (ZIP): W1N 4AL 
(G) TELEPHONE: (0171) 636 54?? 
(H) TELEFAX: (0171) 323 133i" 

(ii) TITLE OF INVENTION: Improvements in or Relating to Gene Expression 

(iii) NUMBER OF SEQUENCES: 11 

(iv) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Floppy disk 
(B) COMPUTER: IBM PC compatible 
(C) OPERATING SYSTEM: PC-DOS/MS-DOS 
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO) 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
GGCGGATCCA AGGAGATATA ACAATGAGTA AAGGAGAAGA ACTTTTCACT 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 33 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 122 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
GATCATATGA AGCGGCACGA CTTCTTCAAG AGCGCCATGC CTGAGGGATA CGTGCAGGAG 60 
AGGACCATCT TCTTCAAGGA CGACGGGAAC TACAAGACAC GTGCTGAAGT CAAGTTTGAG 120 
122 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 126 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
GATGTATACG TTGTGGGAGT TGTAGTTGTA TTCCAACTTG TGGCCGAGGA TGTTTCCGTC 60 
CTCCTTGAAA TCGATTCCCT TAAGCTCGAT CCTGTTGACG AGGGTGTCTC CCTCAAACTT 120 
GACTTC 126 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
GGCGGATCCA AGGAGATATA ACAATGAGTA AAGGAGAAGA ACTTTTCACT 50 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 33 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



• > i 1 CC.-7 GCC 
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
TAGTGGTGGT GGTGGTGGTG TTTGTATAGT TCATCCATGC C 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 60 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
GGCGGATCCA AGGAGATATA ACAATGAAGA CTAATCTTTT TCTCTTTCTC ATCTTTTCAC 60 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
-GAATTCG GCCGAGGATA ATGATAGGAG AAGTGAAAAG ATGAGAAAGA 

(2) INFORMATION FOR SEQ ID NO: 10: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



-TG 

M 9t 



Lvs 



ACT AAT CTT TTT CTC T, , , , ., .. 

Thr Asr leu Phe Leu Phe Leu Tie Phe Ser i * u 
5 10 



ITC ATC TTT TCA CTT CTC CTA TCA 



.eu 



UIL 

val 



"CT 
ier 



CTT 
Leu 
65 

CTT 
Leu 



GAT 
Asp 



TAC 
Tyr 



AC A 
Thr 



vsAG 
Glu 
145 

-AG 
Lvs 



tc: 

Se' 



CCA 



Vol 

50 

AAA 
Lvs 



.eu Leu Ser 
1 = 

^G GCC GAA TTC AGT AAA GGA GAA GAA C~ TTC <ir T r~' p-t 
Ser A,a Glu Phe Ser Lys GU Glu §* Leu P^ ftr ^ g| 
^ J ^ 3Q ' 

ill lS vll ff ■ J? - AT GGT GAT GTT AAT GGG CAC AAA TTT 
lie L-j Val G,u Leu A.so uly Asp Val Asn Gly His Lys Phe 

3 40 -45 

Ser" GW Hu n 7 rf r G f S AT G T A ACA TAC GGA AAA CTT ACC 
Ser Gly Giu Gly Glu Gly Asp Ala Thr Tvr Giv Lys Leu Thr 

52 60 J 



GTC 
Val 



CAT 

His 



G iG 
Val 



CGT 
Arg 
130 

C~ 

Leu 



Leu 



Phi III Jvs ThJ Tnr ^ P A C n CT GTT CCA TGG CGA ACA 

nne ne Lys Thr Tnr Gly Lys Leu Pro Val Pro Tr P Pro Thr 
/0 75 so 

^ T ^u T JJ C TCT TAT GGT Grr CAA TGC TTT TCA AGA TAf CCA 

Thr Thr Phe Ser Tyr Gly Val Gin Cys Phe Ser Arg jfr PrJ 
Hb 90 95 

Sit fj? a™ m AC f C S C MG AGC GCC ATG CCT GAG GGA 

Met Lys Arg His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly 

iUU 1Q 5 110 

CAG GAG AGG ACC ATC TTC TTC AAG GAC GAC GGG AAC TAC AAG 

G n Glu Arg Thr He Phe Phe Lys Asp Asp Gly Asn Tyr Lys 

UD 120 12 5 

All S §Ii ^ 27 GGA GAC ACC CTC GTC AAC AGG ATC 

Ala Glu Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg He 

135 140 

AAG GGA ATC GAT TTC AAG GAG GAC GGA AAC ATC CTC GGC CAC 

^ys G.y ile Asp Phe Lys Glu Asp Gly Asn He Leu G?y His 
150 155 160 

'GAA TAC AAC TAC AAC TCC CAC AAC GTA. TAC ATC ATG GCC GAC 

G,u ivr A.sn ,yr Asn Ser His Asn Val Tyr He Met Ala Asp 
io5 170 i7 5 



48 



96 



144 



192 



240 



288 



336 



384 



432 



480 



528 
ATT GGC GAT GGC CCT GTI CTT TTA CCA. GAC AAC CAT C"- jrr aca 

ne uly -.so .jly ;- 0 Val L=u L°'j p-o As- Acn m\c - ~ ^ c - 6/ 

220 

CAA. TC" GCC Z~ T 
11 n Se r Al, 5 Leu $■ 
■25 230 



' ° AAA. GAi CCC AAC GAA A Am " • ~ -■\~ .<t~ 

L - J " r ~' S P ■ r ^ '-i j lvs Arg Asd his Me: val ■• 

235 240 

CTT CTT GAG TT GTA ACA GC T GCT GGG ^TT AfA r* T v~ a— 

Leu Leu Glu ?n* Val Thr a^ a-,, ,rw V I tk G ^ Aij g " t bAA ''68 

- /oi in, m :u n ,a ui/ i:e Thr His Glv M»t A^p Glu 
24:3 250 ' 255 

CTA TAC AAA. CAC GAC GAA CTC TAA 

Leu Tyr'Lys his ASD Gij L°u ' /92 ' 

260 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 263 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Lys Thr Asn Leu Phe Leu Phe Leu Ile Phe Ser Leu Leu Leu Ser 
1               5                   10                  15 

Leu Ser Ser Ala Glu Phe Ser Lys Gly Glu Glu Leu Phe Thr Gly Val 
            20                  25                  30 

Val Pro Ile Leu Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe 
        35                  40                  45 

Ser Val Ser Gly Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr 
    50                  55                  60 

Leu Lys Phe Ile Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr 
65                  70                  75                  80 

Leu Val Thr Thr Phe Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro 
                85                  90                  95 

Asp His Met Lys Arg His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly 
            100                 105                 110 

Tyr Val Gln Glu Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys 
        115                 120                 125 

Thr Arg Ala Glu Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile 
    130                 135                 140 

Glu Leu Lys Gly Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His 
145                 150                 155                 160 

Lys Leu Glu Tyr Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp 
                165                 170                 175 

Lys Gln Lys Asn Gly Ile Lys Ala Asn Phe Lys Thr Arg His Asn Ile 
            180                 185                 190 

Glu Asp Gly Gly Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro 
        195                 200                 205 

Ile Gly Asp Gly Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr 
    210                 215                 220 

Gln Ser Ala Leu Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val 
225                 230                 235                 240 

Leu Leu Glu Phe Val Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu 
                245                 250                 255 

Leu Tyr Lys His Asp Glu Leu 
            260 
Claims 

1. A DNA sequence encoding Green Fluorescent Protein (GFP), the sequence being 
modified relative to the wild type sequence so as to allow for more efficient 
expression in a plant cell of a functional GFP polypeptide. 

2. A DNA sequence according to claim 1, wherein the modification is such as to 
reduce the probability of an RNA sequence transcribed therefrom being subject to 
erroneous splicing in a plant cell. 

3. A DNA sequence according to claim 2, comprising a plurality of nucleotide 
substitutions relative to the wild type sequence, the substitutions serving to reduce the 
excision from the transcribed RNA of the portion corresponding to nucleotides 400-483 
of the DNA sequence. 

4. A DNA sequence according to claim 2 or 3, wherein the modification serves to 
decrease the A/U content of the transcribed RNA. 

5. A DNA sequence according to any one of claims 2, 3 or 4, wherein the 
modification serves to decrease the A/U content of the portion of the transcribed RNA 
corresponding to nucleotides 400-483 of the DNA sequence. 

6. A DNA sequence according to any one of the preceding claims, modified so as to 
cause an amino acid substitution at residue 163 and/or residue 175 relative to the wild 
type protein sequence. 

7. A DNA sequence according to any one of claims 1-6, further comprising a 
cellular localisation signal directing the encoded GFP to a particular cellular 
compartment. 

8. A DNA sequence according to claim 7, wherein the encoded GFP is directed to 
the endoplasmic reticulum (ER). 

9. A DNA sequence according to claim 8, comprising the Arabidopsis thaliana basic 
chitinase localisation signal. 

10. An RNA sequence capable of being transcribed from a DNA sequence according 
to any one of claims 1-9. 

11. A modified GFP polypeptide comprising an amino acid substitution, relative to 
the wild type protein sequence, at residue 163 and/or 175. 

12. A modified GFP according to claim 11, wherein residue 163 is alanine or a 
related amino acid. 

13. A modified GFP according to claim 11 or 12, wherein residue 175 is glycine or a 
related amino acid. 

14. A modified GFP according to any one of claims 11, 12 or 13, comprising one or 
more further amino acid substitutions relative to the wild type protein sequence. 

15. A modified GFP according to any one of claims 11-14, comprising one or more 
further amino acid substitutions in, or immediately adjacent to, residues 65-67. 

16. A modified GFP according to any one of claims 11-15, comprising a localisation 
signal. 

17. A modified GFP according to claim 16, comprising a signal directing the GFP to 
the endoplasmic reticulum. 

18. A modified GFP according to claim 17, comprising the Arabidopsis thaliana 
basic chitinase localisation signal. 

19. A nucleic acid sequence encoding a polypeptide according to any one of claims 
11-18. 



0 Q PCT/CB96/00-I81 

- u - A nucie,c acid construct co m 

llc any one of 

:3 - Arne ^ofde t e Ctmetheexn . 

Ara ^„ fscreenin „ 

^ *e s= auence of ,„ ' C ° mPnSra S "Uxu* a con,,™, ' ^ ' 

• - *• ~ of raodm : d n ;; * a suffiCKnt length 





SUBSTITUTE SHEET (RULE 26) 



Lanes: A C G T 

5' AGAAUCGAGUUAAAAGACAAACAAAAGAAU 3' 

Fig 3A 



5' splice site          3' splice site 

| 84 base intron | 

. AG GUAUUGA . . . UCAUGGCAG AC . . . 

GFP sequence 

5' splice site          3' splice site 

| A:U rich intron | 
. AG GUAAGU . AG G 

plant consensus 

Fig. 3B 
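
The cryptic intron in Fig. 3B spans 84 bases, which matches the span of nucleotides 400-483 recited in the claims, and the plant consensus is described as A:U rich. The A/U content of such a region can be computed as in the sketch below. This is an illustration only: the function name is hypothetical and the toy sequence is invented for the example rather than taken from the gfp sequence.

```python
def au_content(rna, start, end):
    """Fraction of A and U bases in positions start..end (1-based, inclusive).

    T is treated as U so that a DNA coding-strand sequence can be passed in.
    """
    if not (1 <= start <= end <= len(rna)):
        raise ValueError("start and end must lie within the sequence")
    region = rna[start - 1:end].upper().replace("T", "U")
    return sum(base in "AU" for base in region) / len(region)


if __name__ == "__main__":
    # Hypothetical toy sequence: a G/C flank on each side of an A/U-only core.
    toy = "GGCC" + "AUAU" * 3 + "GGCC"
    print(f"core A/U content:  {au_content(toy, 5, 16):.2f}")
    print(f"whole A/U content: {au_content(toy, 1, len(toy)):.2f}")

    # Length of the region recited in the claims (nucleotides 400-483).
    print(f"region length: {483 - 400 + 1} bases")
```

Applied to the wild type and modified gfp sequences over positions 400 to 483, a calculation of this kind would quantify the reduction in A/U content that the claimed modifications are intended to achieve.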

[Figure: axis labelled "Relative fluorescence (arbitrary units)"] 




< 42.7 kDa 



< 30.0 kDa 



< 17.2 kDa 



Fig. 8 



[Figure: axis labelled "Fluorescence (normalised)"] 

[Figure: axis labelled "Absorbance"] 

[Figure: axis labelled "Fluorescence (normalised)"] 



QJ 
CU 



Cn 
Di i 



Cn 
Ui • 
5 



^ Oi 
Cn ' 
6 



"1 

a, 

^ Cn 
Cn ' 
6 



U~l 



in 

^ to 
Cn ' 
6 



^ Cr. 
E 



00 
LD 

cn 



u u 

iu <o 

• Oi 0> D 

u 1_) 



<0 <0 

it) (0 

01 01 u 

u a 

u u 

Oi oi > 

u u 

u u 

U U -3 



HJ c(3 >— • 

iO <u 

u u 

U <J o- 

u u 

u u 

0< U> > 

J u 

u u 

Oi Ol > 

iO IT) 

Oi Oi 

01 Oi o 

u u 

u u 

'J u 

U J ll. 

w u 

U 1-1 

u u -J 

<v m 

O' Oi W 

"I <0 

i\) <0 

<J> 0< ^ 

•1 

Oi 'Ji 

'Ji Ci o 

«j nj 

01 13 

•<J ■« - 



U' Oi 

•o i-j i.': 

'Ji Ui 

-i u 

'U iO ^ 



u u 

(0 HJ >. 

u iJ 

iO <U 

U U h 

iO <u 

<0 <o 

u u < 

Ol 01 

u I j 

« <l Q 

0> Cn 

a u 

Ol Ol (J 

0) 01 

(0 iO 

<0 «J U 

o> cn 

U 1-1 

Ol Ui O 

Oi Oi 

Oi Oi 

10 10 u 

O) 01 

10 iO 

Oi Oi o 

Ol Oi 

U u 

O) Oi 10 

10 10 

u u 

u u > 

Oi Oi 

IJ u 

U U 01 



u u 

u u U. 

u i_i 

fU 10 

10 10 X 

io <o 

0 u 

HI m I 

u u 

Ui Oi 

cn m c 

01 Oi 



1^2 

<0 <0 

u i_i 

U i-i > 

Ol Ol 



U 10 C 
Oi Oi 



Cn Oi O 
Oi Oi 



u 

U 



10 



iO 



Oi Oi u 



u u 
(0 io 



u u U. 



10 id 

<J « I 

<0 iO 

u u 

IJ u _J 

U (J 

u u 

u o t- 

10 (0 



u u 



(0 



10 
'0 
<0 10 



(0 io 
Ol Oi o 
Oi Oi 



10 


"0 




u 


1 1 c 




(0 


(0 




10 


<0 




u 


U 0. 




u 


u 












01 


01 




Mi 


Oi 


Di 3 






u 


u 




o 








u 


OJ 


<0 






u 


u a 






u 


u 




u 


u 




u 


u > 




Ui 


Ol 




u 


i_i 




o 


u a. 




u 


o 




10 


iO 




u 


ij j 




u 


u 




10 


10 










U 


<0 




«J 


10 




0) 


Ci o 




01 


Ol 





<a oi 

io io u; 

iO 10 

Oi 01 

u u X; 

<0 (0 



10 (0 X 

u u 

u u 

10 (0 Q 

01 Oi 

10 iO 

u u a. 

0 u 

u u 

u u 

10 (0 

01 oi a: 
io <o 

ID 10 

u u (/) 



u u 

u u 

Ol 01 U 



<0 10 
rtJ 10 O 

u u 



u a > 

Oi Oi 

IJ iJ 

Oi 01 o 

Oi Oi 

U U 

>0 (U >• 



U I-I 



o u 

tJ u Lu 



(J u 

U (J £- 

<0 iO 

w u 

o u e- 

U U 

u u > 

cn oi 



U U J 

u o 



u u U. 
u u 

(0 O 

i-> 1-1 M 

<0 io 

u o 
U U E-i 
10 10 

*J Oi 

cn ui o: 
aj io 

<T3 Ol 
HJ ro u 
Oi Oi 

Ol Ol 
<0 10 O 

u u 

iO Ol 
U 1-1 > 

cn ui 

i-i 0 

01 ITJ >- 

u a 

Oi Oi (J 
Ol Oi 

>0 0) 

iq ia u 
Cn oi 

U iJ 

u u a 

0 u 

01 Ol 

a u x 

iO io 

u u 
u u < 
Ol oi 

u u 

Oi Oi (/) 
iO (0 

Oi Ui 

<0 ^ 

iO 10 

(J o 
u u U. 
u U 

o U 

u u 

0 u 

iO 10 Q 
O. 01 

U 0 
<0 (0 X 

u u 

01 Ol 

oi oi a 
u u 



u u 

u u J 

U U 

U U 

'J U 8- 

10 (0 

i-i u 

iO iO Q 

Ol Oi 

U 10 

Oi Cn CJ 

01 Ol 

iO Ol 

(0 (0 W 

Ol Ol 

U u 

u u U. 



Cn cn 

v iO x. 

io nj 

u u 

u u .> 

UI Oi 

(0 <0 

10 (0 UI 

Ol 01 

u u 

U U < 

01 Ol 



Ol Ol IX 

u u 



<0 10 s 

o u 



UI Oi 

'0 (0 X 

10 (0 

u u 

<0 iO >• 

1-1 u 

U <J 

■a <n zr. 

'0 <u 

Ui 0' 

Ol Oi o 

Oi Oi 

u u 

>0 (U c 

Ui Ui 

i-i U 

10 iO c 

Oi Oi 

10 Ol 

(0 <0 X 

(0 (U 

u u 

u u u. 



c 

0 



U ID 
U (J 

U ID 

o 
u 



O 

Cn (J 

o> 
o 
u 
u 

u >-. 
10 

u 

<U 2 

to 

(0 

Oi 'O 

cn 
o 

<o c 

Oi 
Oi 

<0 uj 
0' 

Cn 

io x 

10 
O 



u 

10 c 

cn 
u 

1—1 ■— 

iO 

o 



— Of Oi o 

Ui cn 

iD Oi 

HI il i. 

iD 



C« Oi 

iO UJ UJ 

Oi u. 

!-> 'J 

■o ic 

'0 Oi 

C> Oi u. 

iJ UJ 

i-i u 

<0 <0 2 

<y io 

IJ (J 

u u > 

Ol Oi 



SUBSTITUTE SHEET (RULE 26) 



WO 96/27675 



PCT/GB96/00481 

























INTERNATIONAL SEARCH REPORT

International Application No
PCT/GB 96/00481

A. CLASSIFICATION OF SUBJECT MATTER
IPC 6 C12N15/82 C12N15/12 C07K14/435 G01N33/52 G01N33/53

According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)
IPC 6 C12N C07K

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practical, search terms used)



C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category

P,X

P,X

P,X

Citation of document, with indication, where appropriate, of the relevant passages

TRENDS IN GENETICS,
vol. 11, no. 8, August 1995,
pages 328-329, XP002003595
HASELOFF J., ET AL.: "GFP in plants"
see the whole document

CURR. BIOL. (1996), 6(3), 325
pages 325-330, XP000571865
CHIU W., ET AL.: "Engineered GFP as a vital reporter in plants"
see the whole document

WO,A,95 07463 (UNIV COLUMBIA; WOODS HOLE ...) [remainder of entry illegible]
see the whole document

-/--

Relevant to claim No.

1-5,10,20,21,24

1-5,10,20,21,24

[X] Further documents are listed in the continuation of box C.

[X] Patent family members are listed in annex.

Date of the actual completion of the international search

28 May 1996

Date of mailing of the international search report

07.06.95

Authorized officer

Maddox, A

Form PCT/ISA/210



INTERNATIONAL SEARCH REPORT

International Application No
PCT/GB 96/00481

C.(Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

Category

P,X

P,X

Citation of document, with indication, where appropriate, of the relevant passages

PLANT CELL REPORTS,
vol. 14, April 1995,
pages 403-406, XP000571886
NIEDZ, R.P., ET AL.: "Green fluorescent protein: an in vivo reporter of plant gene expression"
see the whole document

THE PLANT JOURNAL,
vol. 8, no. 5, November 1995,
pages 777-784, XP002003596
SHEEN, J., ET AL.: "Green-fluorescent protein as a new vital marker in plant cells"
see the whole document

SCIENCE,
vol. 263, 11 February 1994,
pages 802-805, XP002003599
CHALFIE, M., ET AL.: "Green fluorescent protein as a marker for gene expression"
see the whole document

WO,A,91 01305 (UNIV WALES MEDICINE) 7 February 1991
see claim 10

NATURE,
vol. 369, June 1994,
pages 400-403, XP002003600
WANG, S., ET AL.: "Implications for bcd mRNA localization from spatial distribution of exu protein in Drosophila oogenesis"
see the whole document

Relevant to claim No.

23,24

23,24

1-23

1-23

23

Form PCT/ISA/210 (continuation of second sheet) (July 1992)

page 2 of 2



INTERNATIONAL SEARCH REPORT

International Application No
PCT/GB 96/00481

Patent document          Publication   Patent family      Publication
cited in search report   date          member(s)          date

WO-A-9507463                           US-A- 5491084      13-02-96
                                       AU-B- 7795794      27-03-95
                                       CA-A- 2169298      16-03-95

WO-A-9101305             07-02-91      AU-B- 6054590      22-02-91
                                       EP-A- 0484369      13-05-92
                                       JP-T- 5501862      08-04-93

Form PCT/ISA/210 (patent family annex) (July 1992)



